Overview

Dataset statistics

Number of variables: 60
Number of observations: 11746
Missing cells: 14219
Missing cells (%): 2.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 MiB
Average record size in memory: 480.0 B
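
The percentages and sizes above can be cross-checked from the raw counts. The following is a minimal sketch, not the profiler's own code. It assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 2**20 bytes; the variable names are illustrative.

```python
# Values copied from the "Dataset statistics" block of this report.
n_variables = 60
n_observations = 11746
missing_cells = 14219
avg_record_bytes = 480.0

# Assumption: every (row, column) pair counts as one cell.
total_cells = n_variables * n_observations

# Share of cells that are empty, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size = average record size x number of records.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under these assumptions the computed figures round to the values the report states.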

Variable types

Numeric: 12
Categorical: 47
Boolean: 1

Warnings

DOF Benchmarking Submission Status has constant value "In Compliance" Constant
Property Name has a high cardinality: 11740 distinct values High cardinality
Parent Property Id has a high cardinality: 102 distinct values High cardinality
Parent Property Name has a high cardinality: 103 distinct values High cardinality
BBL - 10 digits has a high cardinality: 11580 distinct values High cardinality
NYC Borough, Block and Lot (BBL) self-reported has a high cardinality: 11582 distinct values High cardinality
NYC Building Identification Number (BIN) has a high cardinality: 11508 distinct values High cardinality
Address 1 (self-reported) has a high cardinality: 11645 distinct values High cardinality
Address 2 has a high cardinality: 177 distinct values High cardinality
Postal Code has a high cardinality: 286 distinct values High cardinality
Street Number has a high cardinality: 4198 distinct values High cardinality
Street Name has a high cardinality: 2024 distinct values High cardinality
Primary Property Type - Self Selected has a high cardinality: 55 distinct values High cardinality
List of All Property Use Types at Property has a high cardinality: 813 distinct values High cardinality
Largest Property Use Type has a high cardinality: 54 distinct values High cardinality
Largest Property Use Type - Gross Floor Area (ft²) has a high cardinality: 9484 distinct values High cardinality
2nd Largest Property Use Type has a high cardinality: 59 distinct values High cardinality
2nd Largest Property Use - Gross Floor Area (ft²) has a high cardinality: 2264 distinct values High cardinality
3rd Largest Property Use Type - Gross Floor Area (ft²) has a high cardinality: 964 distinct values High cardinality
ENERGY STAR Score has a high cardinality: 101 distinct values High cardinality
Site EUI (kBtu/ft²) has a high cardinality: 1959 distinct values High cardinality
Weather Normalized Site EUI (kBtu/ft²) has a high cardinality: 1944 distinct values High cardinality
Weather Normalized Site Electricity Intensity (kWh/ft²) has a high cardinality: 441 distinct values High cardinality
Weather Normalized Site Natural Gas Intensity (therms/ft²) has a high cardinality: 66 distinct values High cardinality
Weather Normalized Source EUI (kBtu/ft²) has a high cardinality: 2795 distinct values High cardinality
Fuel Oil #2 Use (kBtu) has a high cardinality: 1906 distinct values High cardinality
Fuel Oil #4 Use (kBtu) has a high cardinality: 1180 distinct values High cardinality
Fuel Oil #5 & 6 Use (kBtu) has a high cardinality: 259 distinct values High cardinality
District Steam Use (kBtu) has a high cardinality: 927 distinct values High cardinality
Natural Gas Use (kBtu) has a high cardinality: 10155 distinct values High cardinality
Weather Normalized Site Natural Gas Use (therms) has a high cardinality: 9632 distinct values High cardinality
Electricity Use - Grid Purchase (kBtu) has a high cardinality: 11406 distinct values High cardinality
Weather Normalized Site Electricity (kWh) has a high cardinality: 10879 distinct values High cardinality
Total GHG Emissions (Metric Tons CO2e) has a high cardinality: 7818 distinct values High cardinality
Direct GHG Emissions (Metric Tons CO2e) has a high cardinality: 5968 distinct values High cardinality
Indirect GHG Emissions (Metric Tons CO2e) has a high cardinality: 5853 distinct values High cardinality
Water Use (All Water Sources) (kgal) has a high cardinality: 7230 distinct values High cardinality
Water Intensity (All Water Sources) (gal/ft²) has a high cardinality: 5607 distinct values High cardinality
Source EUI (kBtu/ft²) has a high cardinality: 2920 distinct values High cardinality
Release Date has a high cardinality: 3537 distinct values High cardinality
NTA has a high cardinality: 144 distinct values High cardinality
Order is highly correlated with Council District High correlation
DOF Gross Floor Area is highly correlated with Property GFA - Self-Reported (ft²) High correlation
Property GFA - Self-Reported (ft²) is highly correlated with DOF Gross Floor Area High correlation
Latitude is highly correlated with Longitude and 1 other field High correlation
Longitude is highly correlated with Latitude High correlation
Council District is highly correlated with Order and 1 other field High correlation
Order is highly correlated with Council District and 1 other field High correlation
DOF Gross Floor Area is highly correlated with Property GFA - Self-Reported (ft²) High correlation
Property GFA - Self-Reported (ft²) is highly correlated with DOF Gross Floor Area High correlation
Latitude is highly correlated with Longitude High correlation
Longitude is highly correlated with Latitude and 1 other field High correlation
Council District is highly correlated with Order and 1 other field High correlation
Census Tract is highly correlated with Order and 2 other fields High correlation
Order is highly correlated with Council District High correlation
DOF Gross Floor Area is highly correlated with Property GFA - Self-Reported (ft²) High correlation
Property GFA - Self-Reported (ft²) is highly correlated with DOF Gross Floor Area High correlation
Council District is highly correlated with Order High correlation
Borough is highly correlated with Order and 5 other fields High correlation
2nd Largest Property Use Type is highly correlated with Largest Property Use Type and 1 other field High correlation
Diesel #2 Use (kBtu) is highly correlated with Weather Normalized Site Natural Gas Intensity (therms/ft²) and 1 other field High correlation
DOF Gross Floor Area is highly correlated with Property GFA - Self-Reported (ft²) High correlation
Largest Property Use Type is highly correlated with 2nd Largest Property Use Type and 5 other fields High correlation
Order is highly correlated with Borough and 6 other fields High correlation
Longitude is highly correlated with Borough and 6 other fields High correlation
Council District is highly correlated with Borough and 6 other fields High correlation
Latitude is highly correlated with Borough and 6 other fields High correlation
Community Board is highly correlated with Borough and 4 other fields High correlation
Census Tract is highly correlated with Borough and 2 other fields High correlation
Number of Buildings - Self-reported is highly correlated with Property GFA - Self-Reported (ft²) High correlation
Weather Normalized Site Natural Gas Intensity (therms/ft²) is highly correlated with Diesel #2 Use (kBtu) and 3 other fields High correlation
Primary Property Type - Self Selected is highly correlated with Largest Property Use Type and 3 other fields High correlation
Property GFA - Self-Reported (ft²) is highly correlated with DOF Gross Floor Area and 1 other field High correlation
3rd Largest Property Use Type is highly correlated with 2nd Largest Property Use Type and 1 other field High correlation
Borough is highly correlated with DOF Benchmarking Submission Status High correlation
2nd Largest Property Use Type is highly correlated with DOF Benchmarking Submission Status High correlation
Metered Areas (Energy) is highly correlated with DOF Benchmarking Submission Status High correlation
Fuel Oil #1 Use (kBtu) is highly correlated with DOF Benchmarking Submission Status High correlation
Diesel #2 Use (kBtu) is highly correlated with DOF Benchmarking Submission Status High correlation
Metered Areas (Water) is highly correlated with DOF Benchmarking Submission Status High correlation
Weather Normalized Site Natural Gas Intensity (therms/ft²) is highly correlated with DOF Benchmarking Submission Status High correlation
Water Required? is highly correlated with DOF Benchmarking Submission Status High correlation
Primary Property Type - Self Selected is highly correlated with Largest Property Use Type and 1 other field High correlation
3rd Largest Property Use Type is highly correlated with DOF Benchmarking Submission Status High correlation
Largest Property Use Type is highly correlated with Primary Property Type - Self Selected and 1 other field High correlation
DOF Benchmarking Submission Status is highly correlated with Borough and 10 other fields High correlation
Street Number has 124 (1.1%) missing values Missing
Street Name has 122 (1.0%) missing values Missing
Borough has 118 (1.0%) missing values Missing
DOF Gross Floor Area has 118 (1.0%) missing values Missing
Water Required? has 118 (1.0%) missing values Missing
Latitude has 2263 (19.3%) missing values Missing
Longitude has 2263 (19.3%) missing values Missing
Community Board has 2263 (19.3%) missing values Missing
Council District has 2263 (19.3%) missing values Missing
Census Tract has 2263 (19.3%) missing values Missing
NTA has 2263 (19.3%) missing values Missing
Number of Buildings - Self-reported is highly skewed (γ1 = 26.43633489) Skewed
Property Name is uniformly distributed Uniform
BBL - 10 digits is uniformly distributed Uniform
NYC Borough, Block and Lot (BBL) self-reported is uniformly distributed Uniform
Address 1 (self-reported) is uniformly distributed Uniform
Order has unique values Unique
Property Id has unique values Unique

Reproduction

Analysis started: 2021-07-25 07:19:29.179980
Analysis finished: 2021-07-25 07:20:31.843568
Duration: 1 minute and 2.66 seconds
Software version: pandas-profiling v3.0.0
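
The reported duration follows directly from the two timestamps. As a small sketch using only the Python standard library (the format string and variable names are illustrative, not taken from the profiler):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-07-25 07:19:29.179980", FMT)
finished = datetime.strptime("2021-07-25 07:20:31.843568", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```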

Variables

Order
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 11746
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7185.759578
Minimum: 1
Maximum: 14993
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 91.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 640.25
Q1: 3428.25
Median: 6986.5
Q3: 11054.5
95-th percentile: 14023.25
Maximum: 14993
Range: 14992
Interquartile range (IQR): 7626.25

Descriptive statistics

Standard deviation: 4323.859984
Coefficient of variation (CV): 0.601726225
Kurtosis: -1.227087483
Mean: 7185.759578
Median Absolute Deviation (MAD): 3793
Skewness: 0.08876040278
Sum: 84403932
Variance: 18695765.17
Monotonicity: Not monotonic
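
Several of the figures above are derived from one another, which gives a quick consistency check. The sketch below recomputes them from the reported values using the standard textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count). Whether the profiler computes them in exactly this way is an assumption.

```python
# Values copied from the "Order" statistics in this report.
minimum, maximum = 1, 14993
q1, q3 = 3428.25, 11054.5
mean = 7185.759578
std = 4323.859984
n = 11746

value_range = maximum - minimum   # report states 14992
iqr = q3 - q1                     # report states 7626.25
cv = std / mean                   # report states 0.601726225
variance = std ** 2               # report states 18695765.17
total = mean * n                  # report states 84403932

print(value_range, iqr, cv, variance, total)
```

Small differences in the last digits are expected because the inputs are themselves rounded in the report.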
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
2047                    1       < 0.1%
5456                    1       < 0.1%
9550                    1       < 0.1%
13644                   1       < 0.1%
3403                    1       < 0.1%
1354                    1       < 0.1%
11591                   1       < 0.1%
9542                    1       < 0.1%
13636                   1       < 0.1%
7489                    1       < 0.1%
Other values (11736)    11736   99.9%

Value                   Count   Frequency (%)
1                       1       < 0.1%
2                       1       < 0.1%
3                       1       < 0.1%
4                       1       < 0.1%
5                       1       < 0.1%
6                       1       < 0.1%
7                       1       < 0.1%
10                      1       < 0.1%
11                      1       < 0.1%
12                      1       < 0.1%

Value                   Count   Frequency (%)
14993                   1       < 0.1%
14992                   1       < 0.1%
14991                   1       < 0.1%
14990                   1       < 0.1%
14989                   1       < 0.1%
14988                   1       < 0.1%
14987                   1       < 0.1%
14986                   1       < 0.1%
14985                   1       < 0.1%
14984                   1       < 0.1%

Property Id
Real number (ℝ≥0)

UNIQUE

Distinct: 11746
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3642958.096
Minimum: 7365
Maximum: 5991312
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 91.9 KiB

Quantile statistics

Minimum: 7365
5-th percentile: 2616220.25
Q1: 2747221.5
Median: 3236403.5
Q3: 4409091.75
95-th percentile: 5833865
Maximum: 5991312
Range: 5983947
Interquartile range (IQR): 1661870.25

Descriptive statistics

Standard deviation: 1049069.665
Coefficient of variation (CV): 0.2879719277
Kurtosis: -0.574975941
Mean: 3642958.096
Median Absolute Deviation (MAD): 581772.5
Skewness: 0.5561643026
Sum: 4.279018579 × 10^10
Variance: 1.100547162 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count   Frequency (%)
5793317                 1       < 0.1%
2952005                 1       < 0.1%
4375882                 1       < 0.1%
4037961                 1       < 0.1%
4406597                 1       < 0.1%
3096554                 1       < 0.1%
2665791                 1       < 0.1%
2745662                 1       < 0.1%
4455741                 1       < 0.1%
3126550                 1       < 0.1%
Other values (11736)    11736   99.9%

Value                   Count   Frequency (%)
7365                    1       < 0.1%
8139                    1       < 0.1%
8604                    1       < 0.1%
8841                    1       < 0.1%
11809                   1       < 0.1%
13286                   1       < 0.1%
28400                   1       < 0.1%
28402                   1       < 0.1%
28404                   1       < 0.1%
1025339                 1       < 0.1%

Value                   Count   Frequency (%)
5991312                 1       < 0.1%
5990852                 1       < 0.1%
5990844                 1       < 0.1%
5990733                 1       < 0.1%
5990147                 1       < 0.1%
5990126                 1       < 0.1%
5990093                 1       < 0.1%
5990066                 1       < 0.1%
5990023                 1       < 0.1%
5990016                 1       < 0.1%

Property Name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 11740
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 91.9 KiB

Fairchild                   2
Jetro Cash && Carry         2
Clinton West Condominium    2
Club Quarters Hotel         2
East Building               2
Other values (11735)        11736

Length

Max length: 78
Median length: 22
Mean length: 22.81210625
Min length: 3

Characters and Unicode

Total characters: 267951
Distinct characters: 85
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
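
As an illustration of what such character properties look like, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five Property Name sample rows listed in this report as input. It is not the profiler's implementation, and it covers only general categories (the standard library does not expose scripts or blocks). The two-letter codes map to the names used in this report, for example `Ll` is Lowercase Letter, `Nd` is Decimal Number and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

# Sample values copied from the "Property Name" sample rows in this report.
samples = [
    "201/205",
    "NYP Columbia (West Campus)",
    "MSCHoNY North",
    "Herbert Irving Pavilion & Millstein Hospital",
    "Neuro Institute",
]

# unicodedata.category returns the two-letter general category of a character.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

total = sum(category_counts.values())
for code, count in category_counts.most_common():
    print(f"{code}: {count} ({100 * count / total:.1f}%)")
```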

Unique

Unique: 11734
Unique (%): 99.9%

Sample

1st row: 201/205
2nd row: NYP Columbia (West Campus)
3rd row: MSCHoNY North
4th row: Herbert Irving Pavilion & Millstein Hospital
5th row: Neuro Institute

Common Values

Value                        Count   Frequency (%)
Fairchild                    2       < 0.1%
Jetro Cash && Carry          2       < 0.1%
Clinton West Condominium     2       < 0.1%
Club Quarters Hotel          2       < 0.1%
East Building                2       < 0.1%
Main Hospital                2       < 0.1%
Bnos Zion of Bobov           1       < 0.1%
377 Broadway Condominiums    1       < 0.1%
446 Kingston Owners C        1       < 0.1%
533 West 112th Street        1       < 0.1%
Other values (11730)         11730   99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
3016
 
6.2%
street2502
 
5.1%
avenue1630
 
3.3%
ave1297
 
2.7%
west1191
 
2.4%
st1105
 
2.3%
east1099
 
2.3%
llc733
 
1.5%
corp476
 
1.0%
park472
 
1.0%
Other values (9064)35168
72.2%

Most occurring characters

ValueCountFrequency (%)
37035
 
13.8%
e21760
 
8.1%
t16617
 
6.2%
r11637
 
4.3%
a10399
 
3.9%
n9968
 
3.7%
o8328
 
3.1%
18187
 
3.1%
s7600
 
2.8%
06285
 
2.3%
Other values (75)130135
48.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter130396
48.7%
Decimal Number46438
 
17.3%
Uppercase Letter44182
 
16.5%
Space Separator37050
 
13.8%
Dash Punctuation5168
 
1.9%
Other Punctuation3211
 
1.2%
Open Punctuation659
 
0.2%
Close Punctuation659
 
0.2%
Connector Punctuation187
 
0.1%
Math Symbol1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S5720
12.9%
A5244
11.9%
C3734
 
8.5%
E3419
 
7.7%
L2688
 
6.1%
W2368
 
5.4%
P2317
 
5.2%
R2109
 
4.8%
B2016
 
4.6%
M1790
 
4.1%
Other values (16)12777
28.9%
Lowercase Letter
ValueCountFrequency (%)
e21760
16.7%
t16617
12.7%
r11637
8.9%
a10399
 
8.0%
n9968
 
7.6%
o8328
 
6.4%
s7600
 
5.8%
i6087
 
4.7%
l5408
 
4.1%
h4756
 
3.6%
Other values (16)27836
21.3%
Other Punctuation
ValueCountFrequency (%)
.1101
34.3%
:1038
32.3%
,561
17.5%
&232
 
7.2%
/168
 
5.2%
#59
 
1.8%
'38
 
1.2%
;7
 
0.2%
?2
 
0.1%
"2
 
0.1%
Other values (3)3
 
0.1%
Decimal Number
ValueCountFrequency (%)
18187
17.6%
06285
13.5%
25602
12.1%
55184
11.2%
34764
10.3%
44177
9.0%
63319
7.1%
73315
7.1%
83103
 
6.7%
92502
 
5.4%
Space Separator
ValueCountFrequency (%)
37035
> 99.9%
 15
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
(658
99.8%
[1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
)658
99.8%
]1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
-5166
> 99.9%
2
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_187
100.0%
Math Symbol
ValueCountFrequency (%)
+1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin174578
65.2%
Common93373
34.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e21760
 
12.5%
t16617
 
9.5%
r11637
 
6.7%
a10399
 
6.0%
n9968
 
5.7%
o8328
 
4.8%
s7600
 
4.4%
i6087
 
3.5%
S5720
 
3.3%
l5408
 
3.1%
Other values (42)71054
40.7%
Common
ValueCountFrequency (%)
37035
39.7%
18187
 
8.8%
06285
 
6.7%
25602
 
6.0%
55184
 
5.6%
-5166
 
5.5%
34764
 
5.1%
44177
 
4.5%
63319
 
3.6%
73315
 
3.6%
Other values (23)10339
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII267934
> 99.9%
Latin 1 Sup15
 
< 0.1%
Punctuation2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
37035
 
13.8%
e21760
 
8.1%
t16617
 
6.2%
r11637
 
4.3%
a10399
 
3.9%
n9968
 
3.7%
o8328
 
3.1%
18187
 
3.1%
s7600
 
2.8%
06285
 
2.3%
Other values (73)130118
48.6%
Latin 1 Sup
ValueCountFrequency (%)
 15
100.0%
Punctuation
ValueCountFrequency (%)
2
100.0%

Parent Property Id
Categorical

HIGH CARDINALITY

Distinct: 102
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 91.9 KiB

Not Applicable: Standalone Property    11324
3612678                                57
3616399                                33
4985858                                20
4442823                                13
Other values (97)                      299

Length

Max length: 35
Median length: 35
Mean length: 33.99284863
Min length: 5

Characters and Unicode

Total characters: 399280
Distinct characters: 29
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 0.3%

Sample

1st row: 13286
2nd row: 28400
3rd row: 28400
4th row: 28400
5th row: 28400

Common Values

Value                                  Count   Frequency (%)
Not Applicable: Standalone Property    11324   96.4%
3612678                                57      0.5%
3616399                                33      0.3%
4985858                                20      0.2%
4442823                                13      0.1%
4440047                                13      0.1%
3614737                                12      0.1%
5810794                                12      0.1%
5049377                                9       0.1%
5795348                                8       0.1%
Other values (92)                      245     2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not11324
24.8%
applicable11324
24.8%
standalone11324
24.8%
property11324
24.8%
361267857
 
0.1%
361639933
 
0.1%
498585820
 
< 0.1%
444282313
 
< 0.1%
444004713
 
< 0.1%
581079412
 
< 0.1%
Other values (95)274
 
0.6%

Most occurring characters

ValueCountFrequency (%)
o33972
 
8.5%
t33972
 
8.5%
33972
 
8.5%
p33972
 
8.5%
l33972
 
8.5%
a33972
 
8.5%
e33972
 
8.5%
n22648
 
5.7%
r22648
 
5.7%
N11324
 
2.8%
Other values (19)104856
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter305748
76.6%
Uppercase Letter45296
 
11.3%
Space Separator33972
 
8.5%
Other Punctuation11324
 
2.8%
Decimal Number2940
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o33972
11.1%
t33972
11.1%
p33972
11.1%
l33972
11.1%
a33972
11.1%
e33972
11.1%
n22648
7.4%
r22648
7.4%
i11324
 
3.7%
c11324
 
3.7%
Other values (3)33972
11.1%
Decimal Number
ValueCountFrequency (%)
4415
14.1%
6343
11.7%
8327
11.1%
9325
11.1%
3322
11.0%
7303
10.3%
1255
8.7%
2243
8.3%
5240
8.2%
0167
5.7%
Uppercase Letter
ValueCountFrequency (%)
N11324
25.0%
A11324
25.0%
S11324
25.0%
P11324
25.0%
Space Separator
ValueCountFrequency (%)
33972
100.0%
Other Punctuation
ValueCountFrequency (%)
:11324
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin351044
87.9%
Common48236
 
12.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o33972
9.7%
t33972
9.7%
p33972
9.7%
l33972
9.7%
a33972
9.7%
e33972
9.7%
n22648
 
6.5%
r22648
 
6.5%
N11324
 
3.2%
A11324
 
3.2%
Other values (7)79268
22.6%
Common
ValueCountFrequency (%)
33972
70.4%
:11324
 
23.5%
4415
 
0.9%
6343
 
0.7%
8327
 
0.7%
9325
 
0.7%
3322
 
0.7%
7303
 
0.6%
1255
 
0.5%
2243
 
0.5%
Other values (2)407
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII399280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o33972
 
8.5%
t33972
 
8.5%
33972
 
8.5%
p33972
 
8.5%
l33972
 
8.5%
a33972
 
8.5%
e33972
 
8.5%
n22648
 
5.7%
r22648
 
5.7%
N11324
 
2.8%
Other values (19)104856
26.3%

Parent Property Name
Categorical

HIGH CARDINALITY

Distinct: 103
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 91.9 KiB

Not Applicable: Standalone Property       11324
Columbia University (morningside)         57
New York University: Washington Square    33
Original Campus                           20
Second Housing Company Inc                13
Other values (98)                         299

Length

Max length: 69
Median length: 35
Mean length: 34.71105057
Min length: 4

Characters and Unicode

Total characters: 407716
Distinct characters: 69
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 0.3%

Sample

1st row: 201/205
2nd row: NYP Columbia (West Campus)
3rd row: NYP Columbia (West Campus)
4th row: NYP Columbia (West Campus)
5th row: NYP Columbia (West Campus)

Common Values

Value                                      Count   Frequency (%)
Not Applicable: Standalone Property        11324   96.4%
Columbia University (morningside)          57      0.5%
New York University: Washington Square     33      0.3%
Original Campus                            20      0.2%
Second Housing Company Inc                 13      0.1%
Third Housing Company Inc.                 13      0.1%
test                                       12      0.1%
Columbia University Medical Center         12      0.1%
Carriage: Fort Greene Partnership Homes    9       0.1%
First Housing Company Inc.                 8       0.1%
Other values (93)                          245     2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
property11328
24.1%
standalone11324
24.1%
applicable11324
24.1%
not11324
24.1%
university109
 
0.2%
columbia73
 
0.2%
morningside57
 
0.1%
inc54
 
0.1%
campus51
 
0.1%
housing47
 
0.1%
Other values (223)1267
 
2.7%

Most occurring characters

ValueCountFrequency (%)
35212
 
8.6%
e34735
 
8.5%
a34578
 
8.5%
o34506
 
8.5%
t34432
 
8.4%
l34253
 
8.4%
p34131
 
8.4%
n23367
 
5.7%
r23256
 
5.7%
i12046
 
3.0%
Other values (59)107200
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter312987
76.8%
Uppercase Letter47286
 
11.6%
Space Separator35212
 
8.6%
Other Punctuation11481
 
2.8%
Decimal Number486
 
0.1%
Open Punctuation92
 
< 0.1%
Close Punctuation92
 
< 0.1%
Dash Punctuation80
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e34735
11.1%
a34578
11.0%
o34506
11.0%
t34432
11.0%
l34253
10.9%
p34131
10.9%
n23367
7.5%
r23256
7.4%
i12046
 
3.8%
y11523
 
3.7%
Other values (16)36160
11.6%
Uppercase Letter
ValueCountFrequency (%)
S11502
24.3%
N11433
24.2%
P11422
24.2%
A11406
24.1%
C313
 
0.7%
H128
 
0.3%
W126
 
0.3%
U126
 
0.3%
I123
 
0.3%
L82
 
0.2%
Other values (14)625
 
1.3%
Decimal Number
ValueCountFrequency (%)
0104
21.4%
193
19.1%
459
12.1%
354
11.1%
646
9.5%
245
9.3%
724
 
4.9%
924
 
4.9%
522
 
4.5%
815
 
3.1%
Other Punctuation
ValueCountFrequency (%)
:11377
99.1%
.46
 
0.4%
,30
 
0.3%
&15
 
0.1%
/13
 
0.1%
Space Separator
ValueCountFrequency (%)
35212
100.0%
Open Punctuation
ValueCountFrequency (%)
(92
100.0%
Close Punctuation
ValueCountFrequency (%)
)92
100.0%
Dash Punctuation
ValueCountFrequency (%)
-80
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin360273
88.4%
Common47443
 
11.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e34735
9.6%
a34578
9.6%
o34506
9.6%
t34432
9.6%
l34253
9.5%
p34131
9.5%
n23367
 
6.5%
r23256
 
6.5%
i12046
 
3.3%
y11523
 
3.2%
Other values (40)83446
23.2%
Common
ValueCountFrequency (%)
35212
74.2%
:11377
 
24.0%
0104
 
0.2%
193
 
0.2%
(92
 
0.2%
)92
 
0.2%
-80
 
0.2%
459
 
0.1%
354
 
0.1%
.46
 
0.1%
Other values (9)234
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII407716
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35212
 
8.6%
e34735
 
8.5%
a34578
 
8.5%
o34506
 
8.5%
t34432
 
8.4%
l34253
 
8.4%
p34131
 
8.4%
n23367
 
5.7%
r23256
 
5.7%
i12046
 
3.0%
Other values (59)107200
26.3%

BBL - 10 digits
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 11580
Distinct (%): 98.7%
Missing: 11
Missing (%): 0.1%
Memory size: 91.9 KiB

1019730001              26
4067920600              13
4067900001              13
1018860001              10
3019200001              8
Other values (11575)    11665

Length

Max length: 120
Median length: 10
Mean length: 10.05419685
Min length: 10

Characters and Unicode

Total characters: 117986
Distinct characters: 16
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11535
Unique (%): 98.3%

Sample

1st row: 1013160001
2nd row: 1021380040
3rd row: 1021380030
4th row: 1021390001
5th row: 1021390085

Common Values

Value                   Count   Frequency (%)
1019730001              26      0.2%
4067920600              13      0.1%
4067900001              13      0.1%
1018860001              10      0.1%
3019200001              8       0.1%
4067890015              8       0.1%
1020350001              7       0.1%
4067910001              6       0.1%
4084890001              5       < 0.1%
3019587501              5       < 0.1%
Other values (11570)    11634   99.0%
(Missing)               11      0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
101973000126
 
0.2%
406790000113
 
0.1%
406792060013
 
0.1%
101886000110
 
0.1%
40678900158
 
0.1%
30192000018
 
0.1%
10203500017
 
0.1%
40679100016
 
0.1%
10196100015
 
< 0.1%
30195875015
 
< 0.1%
Other values (11564)11646
99.1%

Most occurring characters

ValueCountFrequency (%)
042850
36.3%
118469
15.7%
210552
 
8.9%
39350
 
7.9%
57918
 
6.7%
47864
 
6.7%
76539
 
5.5%
65135
 
4.4%
84940
 
4.2%
94286
 
3.6%
Other values (6)83
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number117903
99.9%
Other Punctuation57
 
< 0.1%
Space Separator12
 
< 0.1%
Dash Punctuation10
 
< 0.1%
Format4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
042850
36.3%
118469
15.7%
210552
 
8.9%
39350
 
7.9%
57918
 
6.7%
47864
 
6.7%
76539
 
5.5%
65135
 
4.4%
84940
 
4.2%
94286
 
3.6%
Other Punctuation
ValueCountFrequency (%)
;54
94.7%
,2
 
3.5%
:1
 
1.8%
Space Separator
ValueCountFrequency (%)
12
100.0%
Dash Punctuation
ValueCountFrequency (%)
-10
100.0%
Format
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common117986
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
042850
36.3%
118469
15.7%
210552
 
8.9%
39350
 
7.9%
57918
 
6.7%
47864
 
6.7%
76539
 
5.5%
65135
 
4.4%
84940
 
4.2%
94286
 
3.6%
Other values (6)83
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII117982
> 99.9%
Punctuation4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
042850
36.3%
118469
15.7%
210552
 
8.9%
39350
 
7.9%
57918
 
6.7%
47864
 
6.7%
76539
 
5.5%
65135
 
4.4%
84940
 
4.2%
94286
 
3.6%
Other values (5)79
 
0.1%
Punctuation
ValueCountFrequency (%)
4
100.0%

NYC Borough, Block and Lot (BBL) self-reported
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 11582
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 91.9 KiB

1019730001              26
4-06792-0600            13
4-06790-0001            13
Not Available           11
1018860001              10
Other values (11577)    11673

Length

Max length: 120
Median length: 12
Mean length: 11.48765537
Min length: 10

Characters and Unicode

Total characters: 134934
Distinct characters: 28
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11537
Unique (%): 98.2%

Sample

1st row: 1013160001
2nd row: 1-02138-0040
3rd row: 1-02138-0030
4th row: 1-02139-0001
5th row: 1-02139-0085

Common Values

Value                   Count   Frequency (%)
1019730001              26      0.2%
4-06792-0600            13      0.1%
4-06790-0001            13      0.1%
Not Available           11      0.1%
1018860001              10      0.1%
3-01920-0001            8       0.1%
4-06789-0015            8       0.1%
1020350001              7       0.1%
4-06791-0001            6       0.1%
2028760055              5       < 0.1%
Other values (11572)    11639   99.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
101973000126
 
0.2%
4-06790-000113
 
0.1%
4-06792-060013
 
0.1%
not11
 
0.1%
available11
 
0.1%
101886000110
 
0.1%
3-01920-00018
 
0.1%
4-06789-00158
 
0.1%
10203500017
 
0.1%
4-06791-00016
 
0.1%
Other values (11568)11656
99.0%

Most occurring characters

ValueCountFrequency (%)
042851
31.8%
118470
13.7%
-16772
 
12.4%
210553
 
7.8%
39351
 
6.9%
57918
 
5.9%
47864
 
5.8%
76539
 
4.8%
65135
 
3.8%
84941
 
3.7%
Other values (18)4540
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number117908
87.4%
Dash Punctuation16772
 
12.4%
Lowercase Letter110
 
0.1%
Other Punctuation58
 
< 0.1%
Space Separator51
 
< 0.1%
Uppercase Letter22
 
< 0.1%
Control9
 
< 0.1%
Format4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
042851
36.3%
118470
15.7%
210553
 
9.0%
39351
 
7.9%
57918
 
6.7%
47864
 
6.7%
76539
 
5.5%
65135
 
4.4%
84941
 
4.2%
94286
 
3.6%
Lowercase Letter
ValueCountFrequency (%)
a22
20.0%
l22
20.0%
o11
10.0%
t11
10.0%
v11
10.0%
i11
10.0%
b11
10.0%
e11
10.0%
Other Punctuation
ValueCountFrequency (%)
;54
93.1%
,2
 
3.4%
:1
 
1.7%
.1
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
N11
50.0%
A11
50.0%
Dash Punctuation
ValueCountFrequency (%)
-16772
100.0%
Space Separator
ValueCountFrequency (%)
51
100.0%
Format
ValueCountFrequency (%)
4
100.0%
Control
ValueCountFrequency (%)
9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common134802
99.9%
Latin132
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
042851
31.8%
118470
13.7%
-16772
 
12.4%
210553
 
7.8%
39351
 
6.9%
57918
 
5.9%
47864
 
5.8%
76539
 
4.9%
65135
 
3.8%
84941
 
3.7%
Other values (8)4408
 
3.3%
Latin
ValueCountFrequency (%)
a22
16.7%
l22
16.7%
N11
8.3%
o11
8.3%
t11
8.3%
A11
8.3%
v11
8.3%
i11
8.3%
b11
8.3%
e11
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII134930
> 99.9%
Punctuation4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
042851
31.8%
118470
13.7%
-16772
 
12.4%
210553
 
7.8%
39351
 
6.9%
57918
 
5.9%
47864
 
5.8%
76539
 
4.8%
65135
 
3.8%
84941
 
3.7%
Other values (17)4536
 
3.4%
Punctuation
ValueCountFrequency (%)
4
100.0%

NYC Building Identification Number (BIN)
Categorical

HIGH CARDINALITY

Distinct: 11508
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 91.9 KiB

Not Available           162
4455438                 13
4455379                 13
4451548                 8
4451568                 6
Other values (11503)    11544

Length

Max length: 1759
Median length: 7
Mean length: 12.02971224
Min length: 6

Characters and Unicode

Total characters: 141301
Distinct characters: 32
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11469
Unique (%): 97.6%

Sample

1st row: 1037549
2nd row: 1084198; 1084387;1084385; 1084386; 1084388; 1084389; 1807867; 1809824
3rd row: 1063380
4th row: 1087281; 1076746
5th row: 1063403

Common Values

Value                   Count   Frequency (%)
Not Available           162     1.4%
4455438                 13      0.1%
4455379                 13      0.1%
4451548                 8       0.1%
4451568                 6       0.1%
2008816                 5       < 0.1%
4452153                 4       < 0.1%
4440184                 3       < 0.1%
1084475                 3       < 0.1%
1084468                 2       < 0.1%
Other values (11498)    11527   98.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available162
 
1.3%
not162
 
1.3%
445537913
 
0.1%
445543813
 
0.1%
11
 
0.1%
44515488
 
0.1%
44515686
 
< 0.1%
33207365
 
< 0.1%
32571045
 
< 0.1%
20088165
 
< 0.1%
Other values (11855)11934
96.8%

Most occurring characters

ValueCountFrequency (%)
421315
15.1%
020118
14.2%
118055
12.8%
313479
9.5%
512005
8.5%
211428
8.1%
89363
6.6%
68906
6.3%
78150
 
5.8%
98041
 
5.7%
Other values (22)10441
7.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130860
92.6%
Other Punctuation7045
 
5.0%
Lowercase Letter1628
 
1.2%
Space Separator1387
 
1.0%
Uppercase Letter324
 
0.2%
Control43
 
< 0.1%
Dash Punctuation14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l326
20.0%
a324
19.9%
t163
10.0%
i163
10.0%
e163
10.0%
o162
10.0%
v162
10.0%
b162
10.0%
m1
 
0.1%
u1
 
0.1%
Decimal Number
ValueCountFrequency (%)
421315
16.3%
020118
15.4%
118055
13.8%
313479
10.3%
512005
9.2%
211428
8.7%
89363
7.2%
68906
6.8%
78150
 
6.2%
98041
 
6.1%
Other Punctuation
ValueCountFrequency (%)
;6929
98.4%
,92
 
1.3%
:22
 
0.3%
/2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space)1290
93.0%
(no-break space)97
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
N162
50.0%
A162
50.0%
Control
ValueCountFrequency (%)
32
74.4%
11
 
25.6%
Dash Punctuation
ValueCountFrequency (%)
-14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common139349
98.6%
Latin1952
 
1.4%

Most frequent character per script

Common
ValueCountFrequency (%)
421315
15.3%
020118
14.4%
118055
13.0%
313479
9.7%
512005
8.6%
211428
8.2%
89363
6.7%
68906
6.4%
78150
 
5.8%
98041
 
5.8%
Other values (9)8489
 
6.1%
Latin
ValueCountFrequency (%)
l326
16.7%
a324
16.6%
t163
8.4%
i163
8.4%
e163
8.4%
N162
8.3%
o162
8.3%
A162
8.3%
v162
8.3%
b162
8.3%
Other values (3)3
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII141204
99.9%
Latin 1 Sup97
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
421315
15.1%
020118
14.2%
118055
12.8%
313479
9.5%
512005
8.5%
211428
8.1%
89363
6.6%
68906
6.3%
78150
 
5.8%
98041
 
5.7%
Other values (21)10344
7.3%
Latin 1 Sup
ValueCountFrequency (%)
(no-break space)97
100.0%

Address 1 (self-reported)
Categorical

HIGH CARDINALITY
UNIFORM

Distinct11645
Distinct (%)99.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
410 West 118th Street
 
12
P.O. BOX 300513
 
7
210 West 150th Street
 
7
387 ADELPHI STREET,
 
5
828 Midwood Street
 
4
Other values (11640)
11711 

Length

Max length71
Median length18
Mean length18.36225098
Min length5
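
The four length statistics are plain summaries of the per-row string lengths. As a sketch using only the standard library (the helper name `length_summary` is illustrative, and the input is the five sample rows shown for this variable rather than the full column):

```python
from statistics import mean, median


def length_summary(values):
    """Return max, median, mean and min of the string lengths."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


sample = [
    "201/205 East 42nd st.",
    "622 168th Street",
    "3975 Broadway",
    "161 Fort Washington Ave",
    "710 West 168th Street",
]
summary = length_summary(sample)
print(summary)
```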

Characters and Unicode

Total characters215683
Distinct characters77
Distinct categories9
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11587
Unique (%)98.6%
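
"Distinct" and "Unique" measure different things: distinct counts every different value, while unique counts only the values that occur exactly once. Here 11645 distinct values against 11587 unique ones implies that 58 addresses appear more than once. A small sketch of the distinction, with an illustrative helper name and made-up input:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


sample = [
    "410 West 118th Street",
    "410 West 118th Street",
    "3975 Broadway",
    "622 168th Street",
]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)
```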

Sample

1st row201/205 East 42nd st.
2nd row622 168th Street
3rd row3975 Broadway
4th row161 Fort Washington Ave
5th row710 West 168th Street

Common Values

ValueCountFrequency (%)
410 West 118th Street12
 
0.1%
P.O. BOX 3005137
 
0.1%
210 West 150th Street7
 
0.1%
387 ADELPHI STREET, 5
 
< 0.1%
828 Midwood Street4
 
< 0.1%
2000 East Tremont Ave4
 
< 0.1%
226-02 Manor Road4
 
< 0.1%
321 East 16th St4
 
< 0.1%
2800 West 5th Street3
 
< 0.1%
1000 10th AVE3
 
< 0.1%
Other values (11635)11693
99.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
street4150
 
10.3%
avenue2650
 
6.6%
west1603
 
4.0%
ave1495
 
3.7%
east1411
 
3.5%
st1069
 
2.6%
park390
 
1.0%
broadway309
 
0.8%
e295
 
0.7%
road289
 
0.7%
Other values (6072)26691
66.1%
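
The table above counts lower-cased words across all rows. The exact tokenisation the profiler uses is not stated in this report; the sketch below assumes lower-casing followed by a split on whitespace, which is only an approximation.

```python
from collections import Counter


def word_frequencies(values):
    """Count lower-cased, whitespace-separated tokens across all strings."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


sample = ["410 West 118th Street", "210 West 150th Street", "3975 Broadway"]
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

Under this assumption "Street", "STREET" and "street" collapse into one token, while abbreviations such as "st" and "ave" remain separate entries, as they do in the table.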

Most occurring characters

ValueCountFrequency (%)
28941
 
13.4%
e21312
 
9.9%
t17962
 
8.3%
19446
 
4.4%
r9421
 
4.4%
n7098
 
3.3%
a7005
 
3.2%
06907
 
3.2%
26224
 
2.9%
56089
 
2.8%
Other values (67)95278
44.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter102718
47.6%
Decimal Number51755
24.0%
Space Separator28953
 
13.4%
Uppercase Letter28231
 
13.1%
Dash Punctuation2636
 
1.2%
Other Punctuation1348
 
0.6%
Open Punctuation21
 
< 0.1%
Close Punctuation20
 
< 0.1%
Control1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S6013
21.3%
A4757
16.9%
E3112
11.0%
W2390
 
8.5%
B1353
 
4.8%
P1307
 
4.6%
R1132
 
4.0%
T1093
 
3.9%
N855
 
3.0%
C818
 
2.9%
Other values (16)5401
19.1%
Lowercase Letter
ValueCountFrequency (%)
e21312
20.7%
t17962
17.5%
r9421
9.2%
n7098
 
6.9%
a7005
 
6.8%
s5504
 
5.4%
v4934
 
4.8%
h4615
 
4.5%
o4293
 
4.2%
u3777
 
3.7%
Other values (16)16797
16.4%
Decimal Number
ValueCountFrequency (%)
19446
18.3%
06907
13.3%
26224
12.0%
56089
11.8%
35273
10.2%
44626
8.9%
63735
 
7.2%
73458
 
6.7%
83225
 
6.2%
92772
 
5.4%
Other Punctuation
ValueCountFrequency (%)
.751
55.7%
,421
31.2%
/114
 
8.5%
;26
 
1.9%
&26
 
1.9%
'7
 
0.5%
:2
 
0.1%
#1
 
0.1%
Space Separator
ValueCountFrequency (%)
(space)28941
> 99.9%
(no-break space)12
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-2634
99.9%
2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
(21
100.0%
Close Punctuation
ValueCountFrequency (%)
)20
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin130949
60.7%
Common84734
39.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e21312
16.3%
t17962
13.7%
r9421
 
7.2%
n7098
 
5.4%
a7005
 
5.3%
S6013
 
4.6%
s5504
 
4.2%
v4934
 
3.8%
A4757
 
3.6%
h4615
 
3.5%
Other values (42)42328
32.3%
Common
ValueCountFrequency (%)
28941
34.2%
19446
 
11.1%
06907
 
8.2%
26224
 
7.3%
56089
 
7.2%
35273
 
6.2%
44626
 
5.5%
63735
 
4.4%
73458
 
4.1%
83225
 
3.8%
Other values (15)6810
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII215669
> 99.9%
Latin 1 Sup12
 
< 0.1%
Punctuation2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
28941
 
13.4%
e21312
 
9.9%
t17962
 
8.3%
19446
 
4.4%
r9421
 
4.4%
n7098
 
3.3%
a7005
 
3.2%
06907
 
3.2%
26224
 
2.9%
56089
 
2.8%
Other values (65)95264
44.2%
Latin 1 Sup
ValueCountFrequency (%)
(no-break space)12
100.0%
Punctuation
ValueCountFrequency (%)
2
100.0%

Address 2
Categorical

HIGH CARDINALITY

Distinct177
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
11539 
Default Info
 
14
Multiple Addresses-See Unique Bldg ID
 
9
B-230
 
8
Bronx
 
2
Other values (172)
 
174

Length

Max length62
Median length13
Mean length13.10063
Min length2

Characters and Unicode

Total characters153880
Distinct characters67
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique170
Unique (%)1.4%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th row177 Fort Washington Ave
5th rowNot Available

Common Values

ValueCountFrequency (%)
Not Available11539
98.2%
Default Info14
 
0.1%
Multiple Addresses-See Unique Bldg ID9
 
0.1%
B-2308
 
0.1%
Bronx2
 
< 0.1%
1650 BROADWAY SUITE 9102
 
< 0.1%
aka 10 Clinton St2
 
< 0.1%
(2031 Fred Douglass Blvd; 304 W 111 St)1
 
< 0.1%
Ruxton1
 
< 0.1%
139-76,80 85th Drive1
 
< 0.1%
Other values (167)167
 
1.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available11539
48.6%
not11539
48.6%
street40
 
0.2%
avenue26
 
0.1%
aka19
 
0.1%
st15
 
0.1%
default14
 
0.1%
broadway14
 
0.1%
info14
 
0.1%
west14
 
0.1%
Other values (337)525
 
2.2%

Most occurring characters

ValueCountFrequency (%)
a23265
15.1%
l23171
15.1%
12014
7.8%
e11877
7.7%
t11795
7.7%
o11647
7.6%
i11615
7.5%
A11605
7.5%
v11591
7.5%
N11551
7.5%
Other values (57)13749
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter117383
76.3%
Uppercase Letter23589
 
15.3%
Space Separator12014
 
7.8%
Decimal Number750
 
0.5%
Dash Punctuation70
 
< 0.1%
Other Punctuation58
 
< 0.1%
Open Punctuation8
 
< 0.1%
Close Punctuation8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a23265
19.8%
l23171
19.7%
e11877
10.1%
t11795
10.0%
o11647
9.9%
i11615
9.9%
v11591
9.9%
b11549
9.8%
r168
 
0.1%
n132
 
0.1%
Other values (14)573
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
A11605
49.2%
N11551
49.0%
S88
 
0.4%
B47
 
0.2%
D38
 
0.2%
E28
 
0.1%
I28
 
0.1%
W27
 
0.1%
C24
 
0.1%
P23
 
0.1%
Other values (12)130
 
0.6%
Decimal Number
ValueCountFrequency (%)
1155
20.7%
0115
15.3%
289
11.9%
379
10.5%
565
8.7%
457
 
7.6%
653
 
7.1%
852
 
6.9%
947
 
6.3%
738
 
5.1%
Other Punctuation
ValueCountFrequency (%)
.22
37.9%
,16
27.6%
/8
 
13.8%
&4
 
6.9%
#3
 
5.2%
:3
 
5.2%
;2
 
3.4%
Space Separator
ValueCountFrequency (%)
(space)12014
100.0%
Dash Punctuation
ValueCountFrequency (%)
-70
100.0%
Open Punctuation
ValueCountFrequency (%)
(8
100.0%
Close Punctuation
ValueCountFrequency (%)
)8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin140972
91.6%
Common12908
 
8.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a23265
16.5%
l23171
16.4%
e11877
8.4%
t11795
8.4%
o11647
8.3%
i11615
8.2%
A11605
8.2%
v11591
8.2%
N11551
8.2%
b11549
8.2%
Other values (36)1306
 
0.9%
Common
ValueCountFrequency (%)
12014
93.1%
1155
 
1.2%
0115
 
0.9%
289
 
0.7%
379
 
0.6%
-70
 
0.5%
565
 
0.5%
457
 
0.4%
653
 
0.4%
852
 
0.4%
Other values (11)159
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII153880
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a23265
15.1%
l23171
15.1%
12014
7.8%
e11877
7.7%
t11795
7.7%
o11647
7.6%
i11615
7.5%
A11605
7.5%
v11591
7.5%
N11551
7.5%
Other values (57)13749
8.9%

Postal Code
Categorical

HIGH CARDINALITY

Distinct286
Distinct (%)2.4%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
10022
 
269
10016
 
268
10025
 
243
10001
 
239
10024
 
227
Other values (281)
10500 

Length

Max length10
Median length5
Mean length5.017623021
Min length4

Characters and Unicode

Total characters58937
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique97
Unique (%)0.8%

Sample

1st row10017
2nd row10032
3rd row10032
4th row10032
5th row10032

Common Values

ValueCountFrequency (%)
10022269
 
2.3%
10016268
 
2.3%
10025243
 
2.1%
10001239
 
2.0%
10024227
 
1.9%
10019219
 
1.9%
10011218
 
1.9%
10463216
 
1.8%
10021210
 
1.8%
10023209
 
1.8%
Other values (276)9428
80.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10022269
 
2.3%
10016268
 
2.3%
10025243
 
2.1%
10001239
 
2.0%
10024227
 
1.9%
10019219
 
1.9%
10011218
 
1.9%
10463216
 
1.8%
10021210
 
1.8%
10023209
 
1.8%
Other values (276)9428
80.3%

Most occurring characters

ValueCountFrequency (%)
120449
34.7%
014752
25.0%
26191
 
10.5%
34488
 
7.6%
43614
 
6.1%
52839
 
4.8%
62409
 
4.1%
72038
 
3.5%
81285
 
2.2%
9864
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number58929
> 99.9%
Dash Punctuation8
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
120449
34.7%
014752
25.0%
26191
 
10.5%
34488
 
7.6%
43614
 
6.1%
52839
 
4.8%
62409
 
4.1%
72038
 
3.5%
81285
 
2.2%
9864
 
1.5%
Dash Punctuation
ValueCountFrequency (%)
-8
100.0%
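
The eight dash characters and the maximum length of 10 suggest that a few postal codes are stored in ZIP+4 form, and the minimum length of 4 suggests at least one malformed value. A hedged sketch of normalising such values to five digits follows; the helper name `zip5` and the example inputs are hypothetical and are not taken from the dataset.

```python
import re

# Five digits, optionally followed by a dash and a four-digit extension.
ZIP_RE = re.compile(r"^(\d{5})(?:-(\d{4}))?$")


def zip5(raw):
    """Return the five-digit ZIP code, or None if the value is malformed."""
    match = ZIP_RE.match(raw.strip())
    return match.group(1) if match else None


print(zip5("10022"))
print(zip5("10022-1234"))  # hypothetical ZIP+4 value
print(zip5("1002"))        # too short, treated as malformed
```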

Most occurring scripts

ValueCountFrequency (%)
Common58937
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
120449
34.7%
014752
25.0%
26191
 
10.5%
34488
 
7.6%
43614
 
6.1%
52839
 
4.8%
62409
 
4.1%
72038
 
3.5%
81285
 
2.2%
9864
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII58937
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
120449
34.7%
014752
25.0%
26191
 
10.5%
34488
 
7.6%
43614
 
6.1%
52839
 
4.8%
62409
 
4.1%
72038
 
3.5%
81285
 
2.2%
9864
 
1.5%

Street Number
Categorical

HIGH CARDINALITY
MISSING

Distinct4198
Distinct (%)36.1%
Missing124
Missing (%)1.1%
Memory size91.9 KiB
1
 
66
200
 
54
2
 
51
100
 
49
10
 
47
Other values (4193)
11355 
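
The missing percentage is the number of missing cells divided by the number of observations (11746 for this dataset). The sketch below checks the reported 1.1% for this variable and shows the same calculation on a small illustrative list; `missing_summary` is a hypothetical helper, and representing missing cells as `None` is an assumption.

```python
def missing_summary(values):
    """Return (count of missing cells, percentage of missing cells)."""
    n_missing = sum(1 for v in values if v is None)
    pct = 100.0 * n_missing / len(values)
    return n_missing, pct


# Figures reported for Street Number: 124 missing out of 11746 rows.
reported_pct = round(100.0 * 124 / 11746, 1)
print(reported_pct)

print(missing_summary(["675", None, "180", "3975"]))
```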

Length

Max length9
Median length3
Mean length3.504302186
Min length1

Characters and Unicode

Total characters40727
Distinct characters16
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2471
Unique (%)21.3%

Sample

1st row675
2nd row180
3rd row3975
4th row161
5th row193

Common Values

ValueCountFrequency (%)
166
 
0.6%
20054
 
0.5%
251
 
0.4%
10049
 
0.4%
1047
 
0.4%
4046
 
0.4%
5045
 
0.4%
15045
 
0.4%
53045
 
0.4%
22545
 
0.4%
Other values (4188)11129
94.7%
(Missing)124
 
1.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
166
 
0.6%
20054
 
0.5%
251
 
0.4%
10049
 
0.4%
1047
 
0.4%
4046
 
0.4%
22545
 
0.4%
53045
 
0.4%
5045
 
0.4%
15045
 
0.4%
Other values (4188)11129
95.8%

Most occurring characters

ValueCountFrequency (%)
17177
17.6%
05649
13.9%
24863
11.9%
54576
11.2%
33961
9.7%
43434
8.4%
62607
 
6.4%
72246
 
5.5%
82208
 
5.4%
-2046
 
5.0%
Other values (6)1960
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number38662
94.9%
Dash Punctuation2046
 
5.0%
Uppercase Letter18
 
< 0.1%
Other Punctuation1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
17177
18.6%
05649
14.6%
24863
12.6%
54576
11.8%
33961
10.2%
43434
8.9%
62607
 
6.7%
72246
 
5.8%
82208
 
5.7%
91941
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
A13
72.2%
O3
 
16.7%
R1
 
5.6%
D1
 
5.6%
Dash Punctuation
ValueCountFrequency (%)
-2046
100.0%
Other Punctuation
ValueCountFrequency (%)
/1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common40709
> 99.9%
Latin18
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
17177
17.6%
05649
13.9%
24863
11.9%
54576
11.2%
33961
9.7%
43434
8.4%
62607
 
6.4%
72246
 
5.5%
82208
 
5.4%
-2046
 
5.0%
Other values (2)1942
 
4.8%
Latin
ValueCountFrequency (%)
A13
72.2%
O3
 
16.7%
R1
 
5.6%
D1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII40727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17177
17.6%
05649
13.9%
24863
11.9%
54576
11.2%
33961
9.7%
43434
8.4%
62607
 
6.4%
72246
 
5.5%
82208
 
5.4%
-2046
 
5.0%
Other values (6)1960
 
4.8%

Street Name
Categorical

HIGH CARDINALITY
MISSING

Distinct2024
Distinct (%)17.4%
Missing122
Missing (%)1.0%
Memory size91.9 KiB
BROADWAY
 
390
5 AVENUE
 
225
PARK AVENUE
 
189
3 AVENUE
 
146
MADISON AVENUE
 
134
Other values (2019)
10540 

Length

Max length20
Median length14
Mean length13.15941156
Min length6

Characters and Unicode

Total characters152965
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique662
Unique (%)5.7%

Sample

1st row3 AVENUE
2nd rowFT WASHINGTON AVENUE
3rd rowBROADWAY
4th rowFT WASHINGTON AVENUE
5th rowFT WASHINGTON AVENUE

Common Values

ValueCountFrequency (%)
BROADWAY390
 
3.3%
5 AVENUE225
 
1.9%
PARK AVENUE189
 
1.6%
3 AVENUE146
 
1.2%
MADISON AVENUE134
 
1.1%
OCEAN AVENUE129
 
1.1%
RIVERSIDE DRIVE123
 
1.0%
GRAND CONCOURSE120
 
1.0%
WEST END AVENUE119
 
1.0%
OCEAN PARKWAY108
 
0.9%
Other values (2014)9941
84.6%
(Missing)122
 
1.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
street4579
 
17.0%
avenue4498
 
16.7%
west1616
 
6.0%
east1431
 
5.3%
broadway404
 
1.5%
park401
 
1.5%
boulevard350
 
1.3%
road343
 
1.3%
place268
 
1.0%
parkway253
 
0.9%
Other values (1313)12773
47.5%

Most occurring characters

ValueCountFrequency (%)
E27180
17.8%
15292
10.0%
T15114
9.9%
A12534
 
8.2%
R10591
 
6.9%
S10530
 
6.9%
N9142
 
6.0%
U6111
 
4.0%
V5696
 
3.7%
O4989
 
3.3%
Other values (28)35786
23.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter128113
83.8%
Space Separator15292
 
10.0%
Decimal Number9558
 
6.2%
Dash Punctuation2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E27180
21.2%
T15114
11.8%
A12534
9.8%
R10591
 
8.3%
S10530
 
8.2%
N9142
 
7.1%
U6111
 
4.8%
V5696
 
4.4%
O4989
 
3.9%
D3317
 
2.6%
Other values (16)22909
17.9%
Decimal Number
ValueCountFrequency (%)
11650
17.3%
31109
11.6%
21062
11.1%
51060
11.1%
7929
9.7%
4907
9.5%
8900
9.4%
6804
8.4%
9627
 
6.6%
0510
 
5.3%
Space Separator
ValueCountFrequency (%)
(space)15292
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin128113
83.8%
Common24852
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
E27180
21.2%
T15114
11.8%
A12534
9.8%
R10591
 
8.3%
S10530
 
8.2%
N9142
 
7.1%
U6111
 
4.8%
V5696
 
4.4%
O4989
 
3.9%
D3317
 
2.6%
Other values (16)22909
17.9%
Common
ValueCountFrequency (%)
15292
61.5%
11650
 
6.6%
31109
 
4.5%
21062
 
4.3%
51060
 
4.3%
7929
 
3.7%
4907
 
3.6%
8900
 
3.6%
6804
 
3.2%
9627
 
2.5%
Other values (2)512
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII152965
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E27180
17.8%
15292
10.0%
T15114
9.9%
A12534
 
8.2%
R10591
 
6.9%
S10530
 
6.9%
N9142
 
6.0%
U6111
 
4.0%
V5696
 
3.7%
O4989
 
3.3%
Other values (28)35786
23.4%

Borough
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct5
Distinct (%)< 0.1%
Missing118
Missing (%)1.0%
Memory size91.9 KiB
Manhattan
5176 
Brooklyn
2265 
Queens
2091 
Bronx
1937 
Staten Island
 
159

Length

Max length13
Median length8
Mean length7.654110767
Min length5

Characters and Unicode

Total characters89002
Distinct characters20
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowManhattan
2nd rowManhattan
3rd rowManhattan
4th rowManhattan
5th rowManhattan

Common Values

ValueCountFrequency (%)
Manhattan5176
44.1%
Brooklyn2265
19.3%
Queens2091
17.8%
Bronx1937
 
16.5%
Staten Island159
 
1.4%
(Missing)118
 
1.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
manhattan5176
43.9%
brooklyn2265
19.2%
queens2091
17.7%
bronx1937
 
16.4%
island159
 
1.3%
staten159
 
1.3%

Most occurring characters

ValueCountFrequency (%)
n16963
19.1%
a15846
17.8%
t10670
12.0%
o6467
 
7.3%
M5176
 
5.8%
h5176
 
5.8%
e4341
 
4.9%
B4202
 
4.7%
r4202
 
4.7%
l2424
 
2.7%
Other values (10)13535
15.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter77056
86.6%
Uppercase Letter11787
 
13.2%
Space Separator159
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n16963
22.0%
a15846
20.6%
t10670
13.8%
o6467
 
8.4%
h5176
 
6.7%
e4341
 
5.6%
r4202
 
5.5%
l2424
 
3.1%
k2265
 
2.9%
y2265
 
2.9%
Other values (4)6437
 
8.4%
Uppercase Letter
ValueCountFrequency (%)
M5176
43.9%
B4202
35.6%
Q2091
17.7%
S159
 
1.3%
I159
 
1.3%
Space Separator
ValueCountFrequency (%)
(space)159
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin88843
99.8%
Common159
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n16963
19.1%
a15846
17.8%
t10670
12.0%
o6467
 
7.3%
M5176
 
5.8%
h5176
 
5.8%
e4341
 
4.9%
B4202
 
4.7%
r4202
 
4.7%
l2424
 
2.7%
Other values (9)13376
15.1%
Common
ValueCountFrequency (%)
159
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII89002
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n16963
19.1%
a15846
17.8%
t10670
12.0%
o6467
 
7.3%
M5176
 
5.8%
h5176
 
5.8%
e4341
 
4.9%
B4202
 
4.7%
r4202
 
4.7%
l2424
 
2.7%
Other values (10)13535
15.2%

DOF Gross Floor Area
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct9296
Distinct (%)79.9%
Missing118
Missing (%)1.0%
Infinite0
Infinite (%)0.0%
Mean173269.4544
Minimum50028
Maximum13540113
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size91.9 KiB

Quantile statistics

Minimum50028
5-th percentile52800
Q165240
median93138.5
Q3159614
95-th percentile531812.55
Maximum13540113
Range13490085
Interquartile range (IQR)94374

Descriptive statistics

Standard deviation336705.4546
Coefficient of variation (CV)1.943247618
Kurtosis418.0874686
Mean173269.4544
Median Absolute Deviation (MAD)33932
Skewness15.67954306
Sum2014777216
Variance1.133705632 × 10¹¹
MonotonicityNot monotonic
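
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The following sketch re-derives them from the reported values as a consistency check; the tolerances are arbitrary choices that allow for rounding in the report.

```python
import math

# Values copied from the quantile and descriptive statistics above.
mean = 173269.4544
std = 336705.4546
minimum, q1, q3, maximum = 50028, 65240, 159614, 13540113

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported Interquartile range (IQR)
cv = std / mean                   # reported Coefficient of variation (CV)
variance = std ** 2               # reported Variance

print(value_range, iqr)
print(math.isclose(cv, 1.943247618, abs_tol=1e-6))
print(math.isclose(variance, 1.133705632e11, rel_tol=1e-8))
```

A CV close to 2 together with a skewness above 15 indicates a strongly right-skewed distribution, which is consistent with the median (93138.5) sitting far below the mean.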
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6000050
 
0.4%
5400041
 
0.3%
5040033
 
0.3%
7200028
 
0.2%
136887026
 
0.2%
5100025
 
0.2%
6300024
 
0.2%
9000024
 
0.2%
5700023
 
0.2%
6600022
 
0.2%
Other values (9286)11332
96.5%
(Missing)118
 
1.0%
ValueCountFrequency (%)
500282
< 0.1%
500291
< 0.1%
500441
< 0.1%
500491
< 0.1%
500521
< 0.1%
500591
< 0.1%
500661
< 0.1%
500701
< 0.1%
500711
< 0.1%
500751
< 0.1%
ValueCountFrequency (%)
135401131
 
< 0.1%
89421761
 
< 0.1%
85124794
< 0.1%
69404501
 
< 0.1%
55410311
 
< 0.1%
39112542
 
< 0.1%
37505655
< 0.1%
36935392
 
< 0.1%
36780001
 
< 0.1%
31221651
 
< 0.1%

Primary Property Type - Self Selected
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct55
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Multifamily Housing
8688 
Office
1316 
Hotel
 
223
Other
 
185
Non-Refrigerated Warehouse
 
180
Other values (50)
1154 

Length

Max length48
Median length19
Mean length17.22399115
Min length5

Characters and Unicode

Total characters202313
Distinct characters53
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)0.1%

Sample

1st rowOffice
2nd rowHospital (General Medical & Surgical)
3rd rowHospital (General Medical & Surgical)
4th rowHospital (General Medical & Surgical)
5th rowHospital (General Medical & Surgical)

Common Values

ValueCountFrequency (%)
Multifamily Housing8688
74.0%
Office1316
 
11.2%
Hotel223
 
1.9%
Other185
 
1.6%
Non-Refrigerated Warehouse180
 
1.5%
Residence Hall/Dormitory109
 
0.9%
College/University107
 
0.9%
Senior Care Community99
 
0.8%
Self-Storage Facility98
 
0.8%
Retail Store90
 
0.8%
Other values (45)651
 
5.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
housing8688
39.4%
multifamily8688
39.4%
office1368
 
6.2%
other250
 
1.1%
hotel223
 
1.0%
warehouse192
 
0.9%
non-refrigerated180
 
0.8%
facility124
 
0.6%
112
 
0.5%
store111
 
0.5%
Other values (79)2106
 
9.6%

Most occurring characters

ValueCountFrequency (%)
i29505
14.6%
l19011
 
9.4%
u18071
 
8.9%
f11780
 
5.8%
t10857
 
5.4%
o10533
 
5.2%
a10355
 
5.1%
10296
 
5.1%
n9958
 
4.9%
s9487
 
4.7%
Other values (43)62460
30.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter168318
83.2%
Uppercase Letter22568
 
11.2%
Space Separator10296
 
5.1%
Dash Punctuation430
 
0.2%
Other Punctuation427
 
0.2%
Decimal Number172
 
0.1%
Open Punctuation51
 
< 0.1%
Close Punctuation51
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i29505
17.5%
l19011
11.3%
u18071
10.7%
f11780
 
7.0%
t10857
 
6.5%
o10533
 
6.3%
a10355
 
6.2%
n9958
 
5.9%
s9487
 
5.6%
g9281
 
5.5%
Other values (12)29480
17.5%
Uppercase Letter
ValueCountFrequency (%)
H9088
40.3%
M8939
39.6%
O1637
 
7.3%
S612
 
2.7%
R421
 
1.9%
C418
 
1.9%
W221
 
1.0%
P206
 
0.9%
D188
 
0.8%
U184
 
0.8%
Other values (11)654
 
2.9%
Other Punctuation
ValueCountFrequency (%)
/364
85.2%
&47
 
11.0%
,12
 
2.8%
.4
 
0.9%
Decimal Number
ValueCountFrequency (%)
186
50.0%
286
50.0%
Space Separator
ValueCountFrequency (%)
(space)10296
100.0%
Open Punctuation
ValueCountFrequency (%)
(51
100.0%
Close Punctuation
ValueCountFrequency (%)
)51
100.0%
Dash Punctuation
ValueCountFrequency (%)
-430
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin190886
94.4%
Common11427
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i29505
15.5%
l19011
 
10.0%
u18071
 
9.5%
f11780
 
6.2%
t10857
 
5.7%
o10533
 
5.5%
a10355
 
5.4%
n9958
 
5.2%
s9487
 
5.0%
g9281
 
4.9%
Other values (33)52048
27.3%
Common
ValueCountFrequency (%)
10296
90.1%
-430
 
3.8%
/364
 
3.2%
186
 
0.8%
286
 
0.8%
(51
 
0.4%
)51
 
0.4%
&47
 
0.4%
,12
 
0.1%
.4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII202313
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i29505
14.6%
l19011
 
9.4%
u18071
 
8.9%
f11780
 
5.8%
t10857
 
5.4%
o10533
 
5.2%
a10355
 
5.1%
10296
 
5.1%
n9958
 
4.9%
s9487
 
4.7%
Other values (43)62460
30.9%

List of All Property Use Types at Property
Categorical

HIGH CARDINALITY

Distinct813
Distinct (%)6.9%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Multifamily Housing
6182 
Office
 
575
Multifamily Housing, Parking
 
442
Multifamily Housing, Retail Store
 
413
Medical Office, Multifamily Housing
 
206
Other values (808)
3928 

Length

Max length296
Median length19
Mean length24.32828197
Min length5

Characters and Unicode

Total characters285760
Distinct characters53
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique562
Unique (%)4.8%

Sample

1st rowOffice
2nd rowHospital (General Medical & Surgical)
3rd rowHospital (General Medical & Surgical)
4th rowHospital (General Medical & Surgical)
5th rowHospital (General Medical & Surgical)

Common Values

ValueCountFrequency (%)
Multifamily Housing6182
52.6%
Office575
 
4.9%
Multifamily Housing, Parking442
 
3.8%
Multifamily Housing, Retail Store413
 
3.5%
Medical Office, Multifamily Housing206
 
1.8%
Office, Retail Store203
 
1.7%
Multifamily Housing, Other191
 
1.6%
Hotel172
 
1.5%
Multifamily Housing, Office165
 
1.4%
Non-Refrigerated Warehouse123
 
1.0%
Other values (803)3074
26.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
housing8746
27.3%
multifamily8746
27.3%
office2751
 
8.6%
store1648
 
5.1%
retail1443
 
4.5%
parking1215
 
3.8%
other1210
 
3.8%
medical598
 
1.9%
372
 
1.2%
restaurant327
 
1.0%
Other values (94)4984
15.6%

Most occurring characters

ValueCountFrequency (%)
i35651
 
12.5%
l21946
 
7.7%
20294
 
7.1%
u19088
 
6.7%
t17081
 
6.0%
a16098
 
5.6%
f14724
 
5.2%
e13448
 
4.7%
o13330
 
4.7%
n13063
 
4.6%
Other values (43)101037
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter224056
78.4%
Uppercase Letter32890
 
11.5%
Space Separator20294
 
7.1%
Other Punctuation7182
 
2.5%
Dash Punctuation874
 
0.3%
Decimal Number294
 
0.1%
Open Punctuation85
 
< 0.1%
Close Punctuation85
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i35651
15.9%
l21946
9.8%
u19088
 
8.5%
t17081
 
7.6%
a16098
 
7.2%
f14724
 
6.6%
e13448
 
6.0%
o13330
 
5.9%
n13063
 
5.8%
g10812
 
4.8%
Other values (12)48815
21.8%
Uppercase Letter
ValueCountFrequency (%)
M9505
28.9%
H9250
28.1%
O4154
12.6%
S2693
 
8.2%
R2333
 
7.1%
P1497
 
4.6%
C738
 
2.2%
F550
 
1.7%
W334
 
1.0%
D323
 
1.0%
Other values (11)1513
 
4.6%
Other Punctuation
ValueCountFrequency (%)
,6192
86.2%
/905
 
12.6%
&50
 
0.7%
.35
 
0.5%
Decimal Number
ValueCountFrequency (%)
1147
50.0%
2147
50.0%
Space Separator
ValueCountFrequency (%)
(space)20294
100.0%
Open Punctuation
ValueCountFrequency (%)
(85
100.0%
Close Punctuation
ValueCountFrequency (%)
)85
100.0%
Dash Punctuation
ValueCountFrequency (%)
-874
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin256946
89.9%
Common28814
 
10.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i35651
13.9%
l21946
 
8.5%
u19088
 
7.4%
t17081
 
6.6%
a16098
 
6.3%
f14724
 
5.7%
e13448
 
5.2%
o13330
 
5.2%
n13063
 
5.1%
g10812
 
4.2%
Other values (33)81705
31.8%
Common
ValueCountFrequency (%)
20294
70.4%
,6192
 
21.5%
/905
 
3.1%
-874
 
3.0%
1147
 
0.5%
2147
 
0.5%
(85
 
0.3%
)85
 
0.3%
&50
 
0.2%
.35
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII285760
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i35651
 
12.5%
l21946
 
7.7%
20294
 
7.1%
u19088
 
6.7%
t17081
 
6.0%
a16098
 
5.6%
f14724
 
5.2%
e13448
 
4.7%
o13330
 
4.7%
n13063
 
4.6%
Other values (43)101037
35.4%

Largest Property Use Type
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct54
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Multifamily Housing
8694 
Office
1364 
Hotel
 
227
Non-Refrigerated Warehouse
 
202
Other
 
154
Other values (49)
1105 

Length

Max length48
Median length19
Mean length17.19828027
Min length5

Characters and Unicode

Total characters202011
Distinct characters53
Distinct categories8
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)0.1%

Sample

1st rowOffice
2nd rowHospital (General Medical & Surgical)
3rd rowHospital (General Medical & Surgical)
4th rowHospital (General Medical & Surgical)
5th rowHospital (General Medical & Surgical)

Common Values

ValueCountFrequency (%)
Multifamily Housing8694
74.0%
Office1364
 
11.6%
Hotel227
 
1.9%
Non-Refrigerated Warehouse202
 
1.7%
Other154
 
1.3%
Residence Hall/Dormitory111
 
0.9%
K-12 School104
 
0.9%
Senior Care Community100
 
0.9%
Self-Storage Facility98
 
0.8%
Retail Store94
 
0.8%
Other values (44)598
 
5.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
housing8694
39.6%
multifamily8694
39.6%
office1419
 
6.5%
other230
 
1.0%
hotel227
 
1.0%
warehouse211
 
1.0%
non-refrigerated202
 
0.9%
124
 
0.6%
facility122
 
0.6%
store120
 
0.5%
Other values (77)1932
 
8.8%

Most occurring characters

ValueCountFrequency (%)
i29512
14.6%
l19017
 
9.4%
u18088
 
9.0%
f11903
 
5.9%
t10784
 
5.3%
o10553
 
5.2%
a10413
 
5.2%
10229
 
5.1%
n9974
 
4.9%
s9422
 
4.7%
Other values (43)62116
30.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter168087
83.2%
Uppercase Letter22492
 
11.1%
Space Separator10229
 
5.1%
Dash Punctuation481
 
0.2%
Other Punctuation410
 
0.2%
Decimal Number208
 
0.1%
Open Punctuation52
 
< 0.1%
Close Punctuation52
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i29512
17.6%
l19017
11.3%
u18088
10.8%
f11903
7.1%
t10784
 
6.4%
o10553
 
6.3%
a10413
 
6.2%
n9974
 
5.9%
s9422
 
5.6%
g9300
 
5.5%
Other values (12)29121
17.3%
Uppercase Letter
ValueCountFrequency (%)
H9099
40.5%
M8876
39.5%
O1668
 
7.4%
S643
 
2.9%
R447
 
2.0%
C392
 
1.7%
W235
 
1.0%
N204
 
0.9%
D186
 
0.8%
P146
 
0.6%
Other values (11)596
 
2.6%
Other Punctuation
ValueCountFrequency (%)
/346
84.4%
&48
 
11.7%
,12
 
2.9%
.4
 
1.0%
Decimal Number
ValueCountFrequency (%)
1104
50.0%
2104
50.0%
Space Separator
ValueCountFrequency (%)
(space)10229
100.0%
Open Punctuation
ValueCountFrequency (%)
(52
100.0%
Close Punctuation
ValueCountFrequency (%)
)52
100.0%
Dash Punctuation
ValueCountFrequency (%)
-481
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin190579
94.3%
Common11432
 
5.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i29512
15.5%
l19017
 
10.0%
u18088
 
9.5%
f11903
 
6.2%
t10784
 
5.7%
o10553
 
5.5%
a10413
 
5.5%
n9974
 
5.2%
s9422
 
4.9%
g9300
 
4.9%
Other values (33)51613
27.1%
Common
ValueCountFrequency (%)
10229
89.5%
-481
 
4.2%
/346
 
3.0%
1104
 
0.9%
2104
 
0.9%
(52
 
0.5%
)52
 
0.5%
&48
 
0.4%
,12
 
0.1%
.4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII202011
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i29512
14.6%
l19017
 
9.4%
u18088
 
9.0%
f11903
 
5.9%
t10784
 
5.3%
o10553
 
5.2%
a10413
 
5.2%
10229
 
5.1%
n9974
 
4.9%
s9422
 
4.7%
Other values (43)62116
30.7%

Largest Property Use Type - Gross Floor Area (ft²)
Categorical

HIGH CARDINALITY

Distinct9484
Distinct (%)80.7%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
70000
 
61
60000
 
47
54000
 
32
80000
 
31
65000
 
31
Other values (9479)
11544 

Length

Max length13
Median length5
Mean length5.466967478
Min length2

Characters and Unicode

Total characters64215
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8621
Unique (%)73.4%

Sample

1st row293447
2nd row3889181
3rd row231342
4th row1305748
5th row179694

Common Values

Value                 Count   Frequency (%)
70000                 61      0.5%
60000                 47      0.4%
54000                 32      0.3%
80000                 31      0.3%
65000                 31      0.3%
63000                 29      0.2%
52000                 29      0.2%
120000                27      0.2%
66000                 26      0.2%
75000                 25      0.2%
Other values (9474)   11408   97.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
7000061
 
0.5%
6000047
 
0.4%
5400032
 
0.3%
8000031
 
0.3%
6500031
 
0.3%
5200029
 
0.2%
6300029
 
0.2%
12000027
 
0.2%
6600026
 
0.2%
7500025
 
0.2%
Other values (9475)11410
97.1%

Most occurring characters

ValueCountFrequency (%)
012788
19.9%
17813
12.2%
56765
10.5%
66164
9.6%
25719
8.9%
75442
8.5%
85130
8.0%
44977
 
7.8%
34764
 
7.4%
94600
 
7.2%
Other values (12)53
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number64162
99.9%
Other Punctuation27
 
< 0.1%
Lowercase Letter20
 
< 0.1%
Uppercase Letter4
 
< 0.1%
Space Separator2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
012788
19.9%
17813
12.2%
56765
10.5%
66164
9.6%
25719
8.9%
75442
8.5%
85130
8.0%
44977
 
7.8%
34764
 
7.4%
94600
 
7.2%
Lowercase Letter
ValueCountFrequency (%)
a4
20.0%
l4
20.0%
o2
10.0%
t2
10.0%
v2
10.0%
i2
10.0%
b2
10.0%
e2
10.0%
Uppercase Letter
ValueCountFrequency (%)
N2
50.0%
A2
50.0%
Other Punctuation
ValueCountFrequency (%)
.27
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common64191
> 99.9%
Latin24
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
012788
19.9%
17813
12.2%
56765
10.5%
66164
9.6%
25719
8.9%
75442
8.5%
85130
8.0%
44977
 
7.8%
34764
 
7.4%
94600
 
7.2%
Other values (2)29
 
< 0.1%
Latin
ValueCountFrequency (%)
a4
16.7%
l4
16.7%
N2
8.3%
o2
8.3%
t2
8.3%
A2
8.3%
v2
8.3%
i2
8.3%
b2
8.3%
e2
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII64215
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
012788
19.9%
17813
12.2%
56765
10.5%
66164
9.6%
25719
8.9%
75442
8.5%
85130
8.0%
44977
 
7.8%
34764
 
7.4%
94600
 
7.2%
Other values (12)53
 
0.1%

2nd Largest Property Use Type
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct59
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
8005 
Retail Store
948 
Parking
929 
Other
 
424
Office
 
384
Other values (54)
1056 

Length

Max length53
Median length13
Mean length12.41460923
Min length5

Characters and Unicode

Total characters145822
Distinct characters53
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
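
As a rough illustration of how such per-category character counts can be derived, the sketch below uses Python's standard `unicodedata` module. It is not the report's own implementation; the helper name `category_counts` is hypothetical, and it returns the two-letter general category codes (for example `Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator) rather than the long names shown in the report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# "Not Available" is the most common value of this variable.
counts = category_counts(["Not Available"])
print(dict(counts))
```

A single "Not Available" string contributes 10 lowercase letters, 2 uppercase letters, and 1 space, which is why columns dominated by that placeholder show those three categories in a 10 : 2 : 1 ratio.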

Unique

Unique12 ?
Unique (%)0.1%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                                  Count   Frequency (%)
Not Available                          8005    68.2%
Retail Store                           948     8.1%
Parking                                929     7.9%
Other                                  424     3.6%
Office                                 384     3.3%
Medical Office                         285     2.4%
Restaurant                             88      0.7%
Financial Office                       65      0.6%
Supermarket/Grocery Store              58      0.5%
Urgent Care/Clinic/Other Outpatient    51      0.4%
Other values (49)                      509     4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available8005
36.5%
not8005
36.5%
store1024
 
4.7%
retail948
 
4.3%
parking929
 
4.2%
office735
 
3.3%
other538
 
2.5%
medical286
 
1.3%
115
 
0.5%
restaurant105
 
0.5%
Other values (85)1254
 
5.7%

Most occurring characters

ValueCountFrequency (%)
a19371
13.3%
l17801
12.2%
e12913
8.9%
i11886
8.2%
t11704
8.0%
10198
 
7.0%
o9591
 
6.6%
b8115
 
5.6%
v8089
 
5.5%
N8045
 
5.5%
Other values (43)28109
19.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter112900
77.4%
Uppercase Letter22158
 
15.2%
Space Separator10198
 
7.0%
Other Punctuation321
 
0.2%
Dash Punctuation189
 
0.1%
Decimal Number46
 
< 0.1%
Open Punctuation5
 
< 0.1%
Close Punctuation5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a19371
17.2%
l17801
15.8%
e12913
11.4%
i11886
10.5%
t11704
10.4%
o9591
8.5%
b8115
7.2%
v8089
7.2%
r3490
 
3.1%
n1870
 
1.7%
Other values (12)8070
7.1%
Uppercase Letter
ValueCountFrequency (%)
N8045
36.3%
A8036
36.3%
O1399
 
6.3%
S1222
 
5.5%
R1170
 
5.3%
P1008
 
4.5%
M358
 
1.6%
C204
 
0.9%
F148
 
0.7%
B105
 
0.5%
Other values (11)463
 
2.1%
Other Punctuation
ValueCountFrequency (%)
/308
96.0%
,8
 
2.5%
.4
 
1.2%
&1
 
0.3%
Decimal Number
ValueCountFrequency (%)
123
50.0%
223
50.0%
Space Separator
ValueCountFrequency (%)
10198
100.0%
Dash Punctuation
ValueCountFrequency (%)
-189
100.0%
Open Punctuation
ValueCountFrequency (%)
(5
100.0%
Close Punctuation
ValueCountFrequency (%)
)5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin135058
92.6%
Common10764
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a19371
14.3%
l17801
13.2%
e12913
9.6%
i11886
8.8%
t11704
8.7%
o9591
7.1%
b8115
6.0%
v8089
6.0%
N8045
6.0%
A8036
6.0%
Other values (33)19507
14.4%
Common
ValueCountFrequency (%)
10198
94.7%
/308
 
2.9%
-189
 
1.8%
123
 
0.2%
223
 
0.2%
,8
 
0.1%
(5
 
< 0.1%
)5
 
< 0.1%
.4
 
< 0.1%
&1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII145822
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a19371
13.3%
l17801
12.2%
e12913
8.9%
i11886
8.2%
t11704
8.0%
10198
 
7.0%
o9591
 
6.6%
b8115
 
5.6%
v8089
 
5.5%
N8045
 
5.5%
Other values (43)28109
19.3%

2nd Largest Property Use - Gross Floor Area (ft²)
Categorical

HIGH CARDINALITY

Distinct2264
Distinct (%)19.3%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
8005 
0
 
144
5000
 
91
10000
 
62
6000
 
57
Other values (2259)
3387 

Length

Max length13
Median length13
Mean length10.24536012
Min length1

Characters and Unicode

Total characters120342
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2008 ?
Unique (%)17.1%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         8005    68.2%
0                     144     1.2%
5000                  91      0.8%
10000                 62      0.5%
6000                  57      0.5%
3000                  52      0.4%
20000                 45      0.4%
4000                  45      0.4%
2000                  39      0.3%
1000                  39      0.3%
Other values (2254)   3167    27.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not8005
40.5%
available8005
40.5%
0144
 
0.7%
500091
 
0.5%
1000062
 
0.3%
600057
 
0.3%
300052
 
0.3%
400045
 
0.2%
2000045
 
0.2%
100039
 
0.2%
Other values (2255)3206
16.2%

Most occurring characters

ValueCountFrequency (%)
a16010
13.3%
l16010
13.3%
N8005
 
6.7%
o8005
 
6.7%
t8005
 
6.7%
8005
 
6.7%
A8005
 
6.7%
v8005
 
6.7%
i8005
 
6.7%
b8005
 
6.7%
Other values (12)24282
20.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter80050
66.5%
Decimal Number16267
 
13.5%
Uppercase Letter16010
 
13.3%
Space Separator8005
 
6.7%
Other Punctuation10
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
05141
31.6%
11846
 
11.3%
21563
 
9.6%
51540
 
9.5%
31149
 
7.1%
61087
 
6.7%
41064
 
6.5%
71023
 
6.3%
8972
 
6.0%
9882
 
5.4%
Lowercase Letter
ValueCountFrequency (%)
a16010
20.0%
l16010
20.0%
o8005
10.0%
t8005
10.0%
v8005
10.0%
i8005
10.0%
b8005
10.0%
e8005
10.0%
Uppercase Letter
ValueCountFrequency (%)
N8005
50.0%
A8005
50.0%
Space Separator
ValueCountFrequency (%)
8005
100.0%
Other Punctuation
ValueCountFrequency (%)
.10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin96060
79.8%
Common24282
 
20.2%

Most frequent character per script

Common
ValueCountFrequency (%)
8005
33.0%
05141
21.2%
11846
 
7.6%
21563
 
6.4%
51540
 
6.3%
31149
 
4.7%
61087
 
4.5%
41064
 
4.4%
71023
 
4.2%
8972
 
4.0%
Other values (2)892
 
3.7%
Latin
ValueCountFrequency (%)
a16010
16.7%
l16010
16.7%
N8005
8.3%
o8005
8.3%
t8005
8.3%
A8005
8.3%
v8005
8.3%
i8005
8.3%
b8005
8.3%
e8005
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII120342
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a16010
13.3%
l16010
13.3%
N8005
 
6.7%
o8005
 
6.7%
t8005
 
6.7%
8005
 
6.7%
A8005
 
6.7%
v8005
 
6.7%
i8005
 
6.7%
b8005
 
6.7%
Other values (12)24282
20.2%

3rd Largest Property Use Type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct50
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
10262 
Retail Store
 
297
Other
 
204
Parking
 
187
Office
 
184
Other values (45)
 
612

Length

Max length53
Median length13
Mean length12.86403882
Min length5

Characters and Unicode

Total characters151101
Distinct characters53
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11 ?
Unique (%)0.1%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                       Count   Frequency (%)
Not Available               10262   87.4%
Retail Store                297     2.5%
Other                       204     1.7%
Parking                     187     1.6%
Office                      184     1.6%
Medical Office              164     1.4%
Restaurant                  89      0.8%
Supermarket/Grocery Store   48      0.4%
Financial Office            45      0.4%
Bank Branch                 26      0.2%
Other values (40)           240     2.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not10262
44.6%
available10262
44.6%
office396
 
1.7%
store349
 
1.5%
retail297
 
1.3%
other262
 
1.1%
parking187
 
0.8%
medical165
 
0.7%
restaurant106
 
0.5%
59
 
0.3%
Other values (75)651
 
2.8%

Most occurring characters

ValueCountFrequency (%)
a21908
14.5%
l21247
14.1%
e12476
8.3%
t11797
7.8%
i11703
7.7%
11250
7.4%
o10923
7.2%
b10310
6.8%
v10296
6.8%
A10280
6.8%
Other values (43)18911
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter116423
77.0%
Uppercase Letter23094
 
15.3%
Space Separator11250
 
7.4%
Other Punctuation183
 
0.1%
Dash Punctuation101
 
0.1%
Decimal Number26
 
< 0.1%
Open Punctuation12
 
< 0.1%
Close Punctuation12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a21908
18.8%
l21247
18.2%
e12476
10.7%
t11797
10.1%
i11703
10.1%
o10923
9.4%
b10310
8.9%
v10296
8.8%
r1387
 
1.2%
c833
 
0.7%
Other values (12)3543
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
A10280
44.5%
N10278
44.5%
O691
 
3.0%
S492
 
2.1%
R451
 
2.0%
P244
 
1.1%
M182
 
0.8%
F116
 
0.5%
C77
 
0.3%
B72
 
0.3%
Other values (11)211
 
0.9%
Other Punctuation
ValueCountFrequency (%)
/148
80.9%
,23
 
12.6%
.11
 
6.0%
&1
 
0.5%
Decimal Number
ValueCountFrequency (%)
113
50.0%
213
50.0%
Space Separator
ValueCountFrequency (%)
11250
100.0%
Dash Punctuation
ValueCountFrequency (%)
-101
100.0%
Open Punctuation
ValueCountFrequency (%)
(12
100.0%
Close Punctuation
ValueCountFrequency (%)
)12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin139517
92.3%
Common11584
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a21908
15.7%
l21247
15.2%
e12476
8.9%
t11797
8.5%
i11703
8.4%
o10923
7.8%
b10310
7.4%
v10296
7.4%
A10280
7.4%
N10278
7.4%
Other values (33)8299
 
5.9%
Common
ValueCountFrequency (%)
11250
97.1%
/148
 
1.3%
-101
 
0.9%
,23
 
0.2%
113
 
0.1%
213
 
0.1%
(12
 
0.1%
)12
 
0.1%
.11
 
0.1%
&1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII151101
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a21908
14.5%
l21247
14.1%
e12476
8.3%
t11797
7.8%
i11703
7.7%
11250
7.4%
o10923
7.2%
b10310
6.8%
v10296
6.8%
A10280
6.8%
Other values (43)18911
12.5%

3rd Largest Property Use Type - Gross Floor Area (ft²)
Categorical

HIGH CARDINALITY

Distinct964
Distinct (%)8.2%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
10262 
0
 
67
1000
 
37
5000
 
33
2000
 
33
Other values (959)
1314 

Length

Max length13
Median length13
Mean length11.87570237
Min length1

Characters and Unicode

Total characters139492
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique854 ?
Unique (%)7.3%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                Count   Frequency (%)
Not Available        10262   87.4%
0                    67      0.6%
1000                 37      0.3%
5000                 33      0.3%
2000                 33      0.3%
10000                24      0.2%
3000                 19      0.2%
1500                 19      0.2%
500                  16      0.1%
8000                 15      0.1%
Other values (954)   1221    10.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available10262
46.6%
not10262
46.6%
067
 
0.3%
100037
 
0.2%
500033
 
0.1%
200033
 
0.1%
1000024
 
0.1%
300019
 
0.1%
150019
 
0.1%
50016
 
0.1%
Other values (955)1236
 
5.6%

Most occurring characters

ValueCountFrequency (%)
a20524
14.7%
l20524
14.7%
N10262
7.4%
o10262
7.4%
t10262
7.4%
10262
7.4%
A10262
7.4%
v10262
7.4%
i10262
7.4%
b10262
7.4%
Other values (12)16348
11.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter102620
73.6%
Uppercase Letter20524
 
14.7%
Space Separator10262
 
7.4%
Decimal Number6078
 
4.4%
Other Punctuation8
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01886
31.0%
1769
12.7%
2593
 
9.8%
5567
 
9.3%
4432
 
7.1%
3427
 
7.0%
6399
 
6.6%
8358
 
5.9%
7329
 
5.4%
9318
 
5.2%
Lowercase Letter
ValueCountFrequency (%)
a20524
20.0%
l20524
20.0%
o10262
10.0%
t10262
10.0%
v10262
10.0%
i10262
10.0%
b10262
10.0%
e10262
10.0%
Uppercase Letter
ValueCountFrequency (%)
N10262
50.0%
A10262
50.0%
Space Separator
ValueCountFrequency (%)
10262
100.0%
Other Punctuation
ValueCountFrequency (%)
.8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin123144
88.3%
Common16348
 
11.7%

Most frequent character per script

Common
ValueCountFrequency (%)
10262
62.8%
01886
 
11.5%
1769
 
4.7%
2593
 
3.6%
5567
 
3.5%
4432
 
2.6%
3427
 
2.6%
6399
 
2.4%
8358
 
2.2%
7329
 
2.0%
Other values (2)326
 
2.0%
Latin
ValueCountFrequency (%)
a20524
16.7%
l20524
16.7%
N10262
8.3%
o10262
8.3%
t10262
8.3%
A10262
8.3%
v10262
8.3%
i10262
8.3%
b10262
8.3%
e10262
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII139492
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a20524
14.7%
l20524
14.7%
N10262
7.4%
o10262
7.4%
t10262
7.4%
10262
7.4%
A10262
7.4%
v10262
7.4%
i10262
7.4%
b10262
7.4%
Other values (12)16348
11.7%

Year Built
Real number (ℝ≥0)

Distinct        157
Distinct (%)    1.3%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1948.738379
Minimum         1600
Maximum         2019
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     91.9 KiB

Quantile statistics

Minimum                     1600
5-th percentile             1908
Q1                          1927
Median                      1941
Q3                          1966
95-th percentile            2007
Maximum                     2019
Range                       419
Interquartile range (IQR)   39

Descriptive statistics

Standard deviation                30.57638585
Coefficient of variation (CV)     0.01569034929
Kurtosis                          1.692176676
Mean                              1948.738379
Median Absolute Deviation (MAD)   19
Skewness                          0.2264105677
Sum                               22889881
Variance                          934.9153715
Monotonicity                      Not monotonic
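
Several of the Year Built figures can be cross-checked against one another. The sketch below is not part of the report; it recomputes the derived statistics from the reported ones, assuming the usual definitions (CV is standard deviation divided by mean, IQR is Q3 minus Q1, range is maximum minus minimum, variance is the square of the standard deviation). The observation count of 11746 comes from the dataset overview.

```python
# Figures copied from the Year Built summary.
mean = 1948.738379
std = 30.57638585
variance = 934.9153715
minimum, maximum = 1600, 2019
q1, q3 = 1927, 1966
total = 22889881
n = 11746  # number of observations, from the dataset overview

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n        # mean recomputed from the reported sum
variance_from_std = std ** 2     # variance recomputed from the standard deviation

print(cv, iqr, value_range, mean_from_sum, variance_from_std)
```

Each recomputed value agrees with the reported one to within rounding, which suggests the summary is internally consistent.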
Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
1928                 417     3.6%
1927                 412     3.5%
1929                 389     3.3%
1930                 369     3.1%
1931                 327     2.8%
1925                 319     2.7%
1926                 305     2.6%
1920                 287     2.4%
1963                 280     2.4%
1962                 259     2.2%
Other values (147)   8382    71.4%

Value   Count   Frequency (%)
1600    1       < 0.1%
1649    1       < 0.1%
1827    1       < 0.1%
1833    1       < 0.1%
1836    1       < 0.1%
1841    1       < 0.1%
1845    3       < 0.1%
1848    2       < 0.1%
1850    6       0.1%
1853    1       < 0.1%

Value   Count   Frequency (%)
2019    1       < 0.1%
2016    9       0.1%
2015    25      0.2%
2014    40      0.3%
2013    66      0.6%
2012    64      0.5%
2011    48      0.4%
2010    69      0.6%
2009    116     1.0%
2008    120     1.0%

Number of Buildings - Self-reported
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct        49
Distinct (%)    0.4%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1.289971054
Minimum         0
Maximum         161
Zeros           12
Zeros (%)       0.1%
Negative        0
Negative (%)    0.0%
Memory size     91.9 KiB

Quantile statistics

Minimum                     0
5-th percentile             1
Q1                          1
Median                      1
Q3                          1
95-th percentile            1
Maximum                     161
Range                       161
Interquartile range (IQR)   0

Descriptive statistics

Standard deviation                4.017483699
Coefficient of variation (CV)     3.114398332
Kurtosis                          830.1129274
Mean                              1.289971054
Median Absolute Deviation (MAD)   0
Skewness                          26.43633489
Sum                               15152
Variance                          16.14017527
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=49)

Value               Count   Frequency (%)
1                   11268   95.9%
2                   217     1.8%
3                   79      0.7%
4                   35      0.3%
5                   16      0.1%
12                  14      0.1%
8                   14      0.1%
0                   12      0.1%
6                   12      0.1%
10                  9       0.1%
Other values (39)   70      0.6%

Value   Count   Frequency (%)
0       12      0.1%
1       11268   95.9%
2       217     1.8%
3       79      0.7%
4       35      0.3%
5       16      0.1%
6       12      0.1%
7       8       0.1%
8       14      0.1%
9       7       0.1%

Value   Count   Frequency (%)
161     1       < 0.1%
155     1       < 0.1%
140     1       < 0.1%
131     1       < 0.1%
126     1       < 0.1%
107     1       < 0.1%
98      1       < 0.1%
91      1       < 0.1%
83      1       < 0.1%
77      1       < 0.1%

Occupancy
Real number (ℝ≥0)

Distinct        19
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            98.76255747
Minimum         0
Maximum         100
Zeros           34
Zeros (%)       0.3%
Negative        0
Negative (%)    0.0%
Memory size     91.9 KiB

Quantile statistics

Minimum                     0
5-th percentile             95
Q1                          100
Median                      100
Q3                          100
95-th percentile            100
Maximum                     100
Range                       100
Interquartile range (IQR)   0

Descriptive statistics

Standard deviation                7.501603477
Coefficient of variation (CV)     0.07595594595
Kurtosis                          114.9917335
Mean                              98.76255747
Median Absolute Deviation (MAD)   0
Skewness                          -10.00961884
Sum                               1160065
Variance                          56.27405473
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=19)

Value              Count   Frequency (%)
100                10885   92.7%
95                 391     3.3%
90                 206     1.8%
80                 66      0.6%
85                 49      0.4%
0                  34      0.3%
75                 31      0.3%
70                 21      0.2%
50                 13      0.1%
40                 12      0.1%
Other values (9)   38      0.3%

Value   Count   Frequency (%)
0       34      0.3%
5       4       < 0.1%
10      4       < 0.1%
20      1       < 0.1%
25      5       < 0.1%
30      5       < 0.1%
40      12      0.1%
45      1       < 0.1%
50      13      0.1%
55      1       < 0.1%

Value   Count   Frequency (%)
100     10885   92.7%
95      391     3.3%
90      206     1.8%
85      49      0.4%
80      66      0.6%
75      31      0.3%
70      21      0.2%
65      7       0.1%
60      10      0.1%
55      1       < 0.1%

Metered Areas (Energy)
Categorical

HIGH CORRELATION

Distinct8
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Whole Building
11648 
Not Available
 
57
Another configuration
 
31
Common areas only
 
6
Tenant areas only
 
1
Other values (3)
 
3

Length

Max length100
Median length14
Mean length14.03677848
Min length13

Characters and Unicode

Total characters164876
Distinct characters32
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)< 0.1%

Sample

1st rowWhole Building
2nd rowWhole Building
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                                                                                                  Count   Frequency (%)
Whole Building                                                                                         11648   99.2%
Not Available                                                                                          57      0.5%
Another configuration                                                                                  31      0.3%
Common areas only                                                                                      6       0.1%
Tenant areas only                                                                                      1       < 0.1%
Tenant Plug Load/Electricity, Common Area Cooling, Tenant Cooling, Common Area Plug Load/Electricity   1       < 0.1%
Common Area Cooling, Tenant Plug Load/Electricity, Tenant Cooling, Common Area Plug Load/Electricity   1       < 0.1%
Tenant Cooling, Common Area Hot Water, Common Area Heating, Common Area Plug Load/Electricity          1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
whole11648
49.5%
building11648
49.5%
not57
 
0.2%
available57
 
0.2%
another31
 
0.1%
configuration31
 
0.1%
common13
 
0.1%
only7
 
< 0.1%
areas7
 
< 0.1%
area7
 
< 0.1%
Other values (7)24
 
0.1%

Most occurring characters

ValueCountFrequency (%)
l23432
14.2%
i23431
14.2%
o11847
7.2%
11784
7.1%
n11779
7.1%
e11763
7.1%
g11690
7.1%
u11684
7.1%
h11679
7.1%
d11653
7.1%
Other values (22)24134
14.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter129588
78.6%
Uppercase Letter23490
 
14.2%
Space Separator11784
 
7.1%
Other Punctuation14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l23432
18.1%
i23431
18.1%
o11847
9.1%
n11779
9.1%
e11763
9.1%
g11690
9.0%
u11684
9.0%
h11679
9.0%
d11653
9.0%
a179
 
0.1%
Other values (9)451
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
W11649
49.6%
B11648
49.6%
A95
 
0.4%
N57
 
0.2%
C18
 
0.1%
T6
 
< 0.1%
P5
 
< 0.1%
L5
 
< 0.1%
E5
 
< 0.1%
H2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
,9
64.3%
/5
35.7%
Space Separator
ValueCountFrequency (%)
11784
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin153078
92.8%
Common11798
 
7.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l23432
15.3%
i23431
15.3%
o11847
7.7%
n11779
7.7%
e11763
7.7%
g11690
7.6%
u11684
7.6%
h11679
7.6%
d11653
7.6%
W11649
7.6%
Other values (19)12471
8.1%
Common
ValueCountFrequency (%)
11784
99.9%
,9
 
0.1%
/5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII164876
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l23432
14.2%
i23431
14.2%
o11847
7.2%
11784
7.1%
n11779
7.1%
e11763
7.1%
g11690
7.1%
u11684
7.1%
h11679
7.1%
d11653
7.1%
Other values (22)24134
14.6%

Metered Areas (Water)
Categorical

HIGH CORRELATION

Distinct7
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Whole Building
7111 
Not Available
4609 
Combination of common and tenant areas
 
10
Common areas only
 
7
Another configuration
 
6
Other values (2)
 
3

Length

Max length38
Median length14
Mean length13.63536523
Min length13

Characters and Unicode

Total characters160161
Distinct characters28
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowNot Available
2nd rowWhole Building
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                                    Count   Frequency (%)
Whole Building                           7111    60.5%
Not Available                            4609    39.2%
Combination of common and tenant areas   10      0.1%
Common areas only                        7       0.1%
Another configuration                    6       0.1%
Tenant areas only                        2       < 0.1%
Tenant areas (all energy loads)          1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
building7111
30.2%
whole7111
30.2%
not4609
19.6%
available4609
19.6%
areas20
 
0.1%
common17
 
0.1%
tenant13
 
0.1%
of10
 
< 0.1%
and10
 
< 0.1%
combination10
 
< 0.1%
Other values (6)24
 
0.1%

Most occurring characters

ValueCountFrequency (%)
l23452
14.6%
i18863
11.8%
o11812
 
7.4%
11798
 
7.4%
e11761
 
7.3%
a9299
 
5.8%
n7212
 
4.5%
d7122
 
4.4%
g7118
 
4.4%
h7117
 
4.4%
Other values (18)44607
27.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter124895
78.0%
Uppercase Letter23466
 
14.7%
Space Separator11798
 
7.4%
Open Punctuation1
 
< 0.1%
Close Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l23452
18.8%
i18863
15.1%
o11812
9.5%
e11761
9.4%
a9299
 
7.4%
n7212
 
5.8%
d7122
 
5.7%
g7118
 
5.7%
h7117
 
5.7%
u7117
 
5.7%
Other values (9)14022
11.2%
Uppercase Letter
ValueCountFrequency (%)
W7111
30.3%
B7111
30.3%
A4615
19.7%
N4609
19.6%
C17
 
0.1%
T3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
11798
100.0%
Open Punctuation
ValueCountFrequency (%)
(1
100.0%
Close Punctuation
ValueCountFrequency (%)
)1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin148361
92.6%
Common11800
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
l23452
15.8%
i18863
12.7%
o11812
 
8.0%
e11761
 
7.9%
a9299
 
6.3%
n7212
 
4.9%
d7122
 
4.8%
g7118
 
4.8%
h7117
 
4.8%
u7117
 
4.8%
Other values (15)37488
25.3%
Common
ValueCountFrequency (%)
11798
> 99.9%
(1
 
< 0.1%
)1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII160161
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l23452
14.6%
i18863
11.8%
o11812
 
7.4%
11798
 
7.4%
e11761
 
7.3%
a9299
 
5.8%
n7212
 
4.5%
d7122
 
4.4%
g7118
 
4.4%
h7117
 
4.4%
Other values (18)44607
27.9%

ENERGY STAR Score
Categorical

HIGH CARDINALITY

Distinct101
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
2104 
100
 
649
1
 
299
99
 
162
80
 
144
Other values (96)
8388 

Length

Max length13
Median length2
Mean length3.955133663
Min length1

Characters and Unicode

Total characters46457
Distinct characters21
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNot Available
2nd row55
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value               Count   Frequency (%)
Not Available       2104    17.9%
100                 649     5.5%
1                   299     2.5%
99                  162     1.4%
80                  144     1.2%
84                  142     1.2%
86                  138     1.2%
83                  138     1.2%
88                  136     1.2%
73                  128     1.1%
Other values (91)   7706    65.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available2104
 
15.2%
not2104
 
15.2%
100649
 
4.7%
1299
 
2.2%
99162
 
1.2%
80144
 
1.0%
84142
 
1.0%
83138
 
1.0%
86138
 
1.0%
88136
 
1.0%
Other values (92)7834
56.6%

Most occurring characters

ValueCountFrequency (%)
a4208
 
9.1%
l4208
 
9.1%
82259
 
4.9%
12206
 
4.7%
92199
 
4.7%
N2104
 
4.5%
o2104
 
4.5%
t2104
 
4.5%
2104
 
4.5%
A2104
 
4.5%
Other values (11)20857
44.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter21040
45.3%
Decimal Number19105
41.1%
Uppercase Letter4208
 
9.1%
Space Separator2104
 
4.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
82259
11.8%
12206
11.5%
92199
11.5%
02100
11.0%
72034
10.6%
61916
10.0%
51783
9.3%
41601
8.4%
31597
8.4%
21410
7.4%
Lowercase Letter
ValueCountFrequency (%)
a4208
20.0%
l4208
20.0%
o2104
10.0%
t2104
10.0%
v2104
10.0%
i2104
10.0%
b2104
10.0%
e2104
10.0%
Uppercase Letter
ValueCountFrequency (%)
N2104
50.0%
A2104
50.0%
Space Separator
ValueCountFrequency (%)
2104
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin25248
54.3%
Common21209
45.7%

Most frequent character per script

Common
ValueCountFrequency (%)
82259
10.7%
12206
10.4%
92199
10.4%
2104
9.9%
02100
9.9%
72034
9.6%
61916
9.0%
51783
8.4%
41601
7.5%
31597
7.5%
Latin
ValueCountFrequency (%)
a4208
16.7%
l4208
16.7%
N2104
8.3%
o2104
8.3%
t2104
8.3%
A2104
8.3%
v2104
8.3%
i2104
8.3%
b2104
8.3%
e2104
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII46457
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a4208
 
9.1%
l4208
 
9.1%
82259
 
4.9%
12206
 
4.7%
92199
 
4.7%
N2104
 
4.5%
o2104
 
4.5%
t2104
 
4.5%
2104
 
4.5%
A2104
 
4.5%
Other values (11)20857
44.9%

Site EUI (kBtu/ft²)
Categorical

HIGH CARDINALITY

Distinct1959
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
 
163
75.3
 
29
88.8
 
28
76.3
 
28
66
 
27
Other values (1954)
11471 

Length

Max length13
Median length4
Mean length4.132130087
Min length1

Characters and Unicode

Total characters48536
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique560 ?
Unique (%)4.8%

Sample

1st row305.6
2nd row229.8
3rd rowNot Available
4th rowNot Available
5th rowNot Available
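
The sample rows suggest why this measurement is typed as Categorical with 0 missing values: the placeholder string "Not Available" sits alongside numeric strings, so it is probably not recognised as a missing value. A minimal sketch of one way to separate the two before re-profiling follows; the helper name `to_float_or_none` is hypothetical, and treating "Not Available" as missing is an assumption about the data rather than something the report states.

```python
def to_float_or_none(raw, missing_token="Not Available"):
    """Return a float for numeric strings and None for the placeholder token."""
    text = raw.strip()
    if text == missing_token:
        return None
    return float(text)


# The five sample rows shown for Site EUI (kBtu/ft²).
sample = ["305.6", "229.8", "Not Available", "Not Available", "Not Available"]
parsed = [to_float_or_none(v) for v in sample]
missing = sum(v is None for v in parsed)
print(parsed, missing)
```

With pandas, `pd.to_numeric(series, errors="coerce")` has a similar effect on a whole column, turning unparseable strings into NaN. After such a conversion the column would likely be profiled as numeric, with the placeholder rows counted as missing.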

Common Values

Value                 Count   Frequency (%)
Not Available         163     1.4%
75.3                  29      0.2%
88.8                  28      0.2%
76.3                  28      0.2%
66                    27      0.2%
72.2                  27      0.2%
72.3                  27      0.2%
66.1                  27      0.2%
70.1                  26      0.2%
85.8                  26      0.2%
Other values (1949)   11338   96.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not163
 
1.4%
available163
 
1.4%
75.329
 
0.2%
88.828
 
0.2%
76.328
 
0.2%
72.227
 
0.2%
72.327
 
0.2%
66.127
 
0.2%
6627
 
0.2%
79.926
 
0.2%
Other values (1950)11364
95.4%

Most occurring characters

ValueCountFrequency (%)
.10381
21.4%
15577
11.5%
74201
8.7%
84040
 
8.3%
64021
 
8.3%
93686
 
7.6%
53402
 
7.0%
23193
 
6.6%
42966
 
6.1%
32899
 
6.0%
Other values (12)4170
8.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number36036
74.2%
Other Punctuation10381
 
21.4%
Lowercase Letter1630
 
3.4%
Uppercase Letter326
 
0.7%
Space Separator163
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
15577
15.5%
74201
11.7%
84040
11.2%
64021
11.2%
93686
10.2%
53402
9.4%
23193
8.9%
42966
8.2%
32899
8.0%
02051
 
5.7%
Lowercase Letter
ValueCountFrequency (%)
a326
20.0%
l326
20.0%
o163
10.0%
t163
10.0%
v163
10.0%
i163
10.0%
b163
10.0%
e163
10.0%
Uppercase Letter
ValueCountFrequency (%)
N163
50.0%
A163
50.0%
Other Punctuation
ValueCountFrequency (%)
.10381
100.0%
Space Separator
ValueCountFrequency (%)
163
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common46580
96.0%
Latin1956
 
4.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.10381
22.3%
15577
12.0%
74201
9.0%
84040
 
8.7%
64021
 
8.6%
93686
 
7.9%
53402
 
7.3%
23193
 
6.9%
42966
 
6.4%
32899
 
6.2%
Other values (2)2214
 
4.8%
Latin
ValueCountFrequency (%)
a326
16.7%
l326
16.7%
N163
8.3%
o163
8.3%
t163
8.3%
A163
8.3%
v163
8.3%
i163
8.3%
b163
8.3%
e163
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII48536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.10381
21.4%
15577
11.5%
74201
8.7%
84040
 
8.3%
64021
 
8.3%
93686
 
7.6%
53402
 
7.0%
23193
 
6.6%
42966
 
6.1%
32899
 
6.0%
Other values (12)4170
8.6%

Weather Normalized Site EUI (kBtu/ft²)
Categorical

HIGH CARDINALITY

Distinct1944
Distinct (%)16.6%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
1465 
85.5
 
26
74.3
 
25
84.6
 
25
80.8
 
24
Other values (1939)
10181 

Length

Max length13
Median length4
Mean length5.169504512
Min length1

Characters and Unicode

Total characters60721
Distinct characters22
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique568 ?
Unique (%)4.8%

Sample

1st row303.1
2nd row228.8
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         1465    12.5%
85.5                  26      0.2%
74.3                  25      0.2%
84.6                  25      0.2%
80.8                  24      0.2%
77.1                  24      0.2%
81.7                  24      0.2%
81                    24      0.2%
82.7                  23      0.2%
85.2                  23      0.2%
Other values (1934)   10063   85.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available1465
 
11.1%
not1465
 
11.1%
85.526
 
0.2%
74.325
 
0.2%
84.625
 
0.2%
80.824
 
0.2%
81.724
 
0.2%
77.124
 
0.2%
8124
 
0.2%
70.623
 
0.2%
Other values (1935)10086
76.3%

Most occurring characters

ValueCountFrequency (%)
.9208
15.2%
15543
 
9.1%
83640
 
6.0%
73630
 
6.0%
93321
 
5.5%
63278
 
5.4%
a2930
 
4.8%
l2930
 
4.8%
22912
 
4.8%
52827
 
4.7%
Other values (12)20502
33.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number32468
53.5%
Lowercase Letter14650
24.1%
Other Punctuation9208
 
15.2%
Uppercase Letter2930
 
4.8%
Space Separator1465
 
2.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
15543
17.1%
83640
11.2%
73630
11.2%
93321
10.2%
63278
10.1%
22912
9.0%
52827
8.7%
32708
8.3%
42644
8.1%
01965
 
6.1%
Lowercase Letter
ValueCountFrequency (%)
a2930
20.0%
l2930
20.0%
o1465
10.0%
t1465
10.0%
v1465
10.0%
i1465
10.0%
b1465
10.0%
e1465
10.0%
Uppercase Letter
ValueCountFrequency (%)
N1465
50.0%
A1465
50.0%
Other Punctuation
ValueCountFrequency (%)
.9208
100.0%
Space Separator
ValueCountFrequency (%)
1465
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common43141
71.0%
Latin17580
29.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.9208
21.3%
15543
12.8%
83640
 
8.4%
73630
 
8.4%
93321
 
7.7%
63278
 
7.6%
22912
 
6.7%
52827
 
6.6%
32708
 
6.3%
42644
 
6.1%
Other values (2)3430
 
8.0%
Latin
ValueCountFrequency (%)
a2930
16.7%
l2930
16.7%
N1465
8.3%
o1465
8.3%
t1465
8.3%
A1465
8.3%
v1465
8.3%
i1465
8.3%
b1465
8.3%
e1465
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII60721
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.9208
15.2%
15543
 
9.1%
83640
 
6.0%
73630
 
6.0%
93321
 
5.5%
63278
 
5.4%
a2930
 
4.8%
l2930
 
4.8%
22912
 
4.8%
52827
 
4.7%
Other values (12)20502
33.8%
Distinct441
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
 
787
4
 
230
4.1
 
221
4.5
 
218
3.7
 
218
Other values (436)
10072 

Length

Max length13
Median length3
Mean length3.690618083
Min length1

Characters and Unicode

Total characters43350
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique125
Unique (%)1.1%

Sample

1st row37.8
2nd row24.8
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         787     6.7%
4                     230     2.0%
4.1                   221     1.9%
4.5                   218     1.9%
3.7                   218     1.9%
3.8                   217     1.8%
3.6                   213     1.8%
3.9                   211     1.8%
4.2                   210     1.8%
3.4                   208     1.8%
Other values (431)    9013    76.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not787
 
6.3%
available787
 
6.3%
4230
 
1.8%
4.1221
 
1.8%
3.7218
 
1.7%
4.5218
 
1.7%
3.8217
 
1.7%
3.6213
 
1.7%
3.9211
 
1.7%
4.2210
 
1.7%
Other values (432)9221
73.6%

Most occurring characters

ValueCountFrequency (%)
.9832
22.7%
13716
 
8.6%
43399
 
7.8%
33296
 
7.6%
52598
 
6.0%
22331
 
5.4%
62180
 
5.0%
71850
 
4.3%
81691
 
3.9%
a1574
 
3.6%
Other values (12)10883
25.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number23287
53.7%
Other Punctuation9832
22.7%
Lowercase Letter7870
 
18.2%
Uppercase Letter1574
 
3.6%
Space Separator787
 
1.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
13716
16.0%
43399
14.6%
33296
14.2%
52598
11.2%
22331
10.0%
62180
9.4%
71850
7.9%
81691
7.3%
91514
6.5%
0712
 
3.1%
Lowercase Letter
ValueCountFrequency (%)
a1574
20.0%
l1574
20.0%
o787
10.0%
t787
10.0%
v787
10.0%
i787
10.0%
b787
10.0%
e787
10.0%
Uppercase Letter
ValueCountFrequency (%)
N787
50.0%
A787
50.0%
Other Punctuation
ValueCountFrequency (%)
.9832
100.0%
Space Separator
ValueCountFrequency (%)
787
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common33906
78.2%
Latin9444
 
21.8%

Most frequent character per script

Common
ValueCountFrequency (%)
.9832
29.0%
13716
 
11.0%
43399
 
10.0%
33296
 
9.7%
52598
 
7.7%
22331
 
6.9%
62180
 
6.4%
71850
 
5.5%
81691
 
5.0%
91514
 
4.5%
Other values (2)1499
 
4.4%
Latin
ValueCountFrequency (%)
a1574
16.7%
l1574
16.7%
N787
8.3%
o787
8.3%
t787
8.3%
A787
8.3%
v787
8.3%
i787
8.3%
b787
8.3%
e787
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII43350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.9832
22.7%
13716
 
8.6%
43399
 
7.8%
33296
 
7.6%
52598
 
6.0%
22331
 
5.4%
62180
 
5.0%
71850
 
4.3%
81691
 
3.9%
a1574
 
3.6%
Other values (12)10883
25.1%

Weather Normalized Site Natural Gas Intensity (therms/ft²)
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct66
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
1963 
0
1851 
0.7
1155 
0.6
1144 
0.5
897 
Other values (61)
4736 

Length

Max length13
Median length3
Mean length4.306742721
Min length1

Characters and Unicode

Total characters50587
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30
Unique (%)0.3%

Sample

1st rowNot Available
2nd row2.4
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         1963    16.7%
0                     1851    15.8%
0.7                   1155    9.8%
0.6                   1144    9.7%
0.5                   897     7.6%
0.1                   896     7.6%
0.8                   819     7.0%
0.4                   602     5.1%
0.3                   534     4.5%
0.2                   532     4.5%
Other values (56)     1353    11.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available1963
14.3%
not1963
14.3%
01851
13.5%
0.71155
8.4%
0.61144
8.3%
0.5897
6.5%
0.1896
6.5%
0.8819
6.0%
0.4602
 
4.4%
0.3534
 
3.9%
Other values (57)1885
13.8%

Most occurring characters

ValueCountFrequency (%)
08946
17.7%
.7632
15.1%
a3926
 
7.8%
l3926
 
7.8%
N1963
 
3.9%
o1963
 
3.9%
t1963
 
3.9%
1963
 
3.9%
A1963
 
3.9%
v1963
 
3.9%
Other values (12)14379
28.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter19630
38.8%
Decimal Number17436
34.5%
Other Punctuation7632
 
15.1%
Uppercase Letter3926
 
7.8%
Space Separator1963
 
3.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
08946
51.3%
11851
 
10.6%
71181
 
6.8%
61174
 
6.7%
5943
 
5.4%
8837
 
4.8%
2684
 
3.9%
4662
 
3.8%
3629
 
3.6%
9529
 
3.0%
Lowercase Letter
ValueCountFrequency (%)
a3926
20.0%
l3926
20.0%
o1963
10.0%
t1963
10.0%
v1963
10.0%
i1963
10.0%
b1963
10.0%
e1963
10.0%
Uppercase Letter
ValueCountFrequency (%)
N1963
50.0%
A1963
50.0%
Space Separator
ValueCountFrequency (%)
1963
100.0%
Other Punctuation
ValueCountFrequency (%)
.7632
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common27031
53.4%
Latin23556
46.6%

Most frequent character per script

Common
ValueCountFrequency (%)
08946
33.1%
.7632
28.2%
1963
 
7.3%
11851
 
6.8%
71181
 
4.4%
61174
 
4.3%
5943
 
3.5%
8837
 
3.1%
2684
 
2.5%
4662
 
2.4%
Other values (2)1158
 
4.3%
Latin
ValueCountFrequency (%)
a3926
16.7%
l3926
16.7%
N1963
8.3%
o1963
8.3%
t1963
8.3%
A1963
8.3%
v1963
8.3%
i1963
8.3%
b1963
8.3%
e1963
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII50587
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
08946
17.7%
.7632
15.1%
a3926
 
7.8%
l3926
 
7.8%
N1963
 
3.9%
o1963
 
3.9%
t1963
 
3.9%
1963
 
3.9%
A1963
 
3.9%
v1963
 
3.9%
Other values (12)14379
28.4%
Distinct2795
Distinct (%)23.8%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
1465 
113.9
 
18
120
 
18
123.5
 
18
120.6
 
17
Other values (2790)
10210 

Length

Max length13
Median length5
Mean length5.626000341
Min length1

Characters and Unicode

Total characters66083
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1035
Unique (%)8.8%

Sample

1st row614.2
2nd row401.1
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         1465    12.5%
113.9                 18      0.2%
120                   18      0.2%
123.5                 18      0.2%
120.6                 17      0.1%
128.7                 17      0.1%
107.7                 17      0.1%
115.6                 16      0.1%
119.6                 16      0.1%
128.6                 16      0.1%
Other values (2785)   10128   86.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not1465
 
11.1%
available1465
 
11.1%
12018
 
0.1%
123.518
 
0.1%
113.918
 
0.1%
107.717
 
0.1%
120.617
 
0.1%
128.717
 
0.1%
016
 
0.1%
143.616
 
0.1%
Other values (2786)10144
76.8%

Most occurring characters

ValueCountFrequency (%)
19750
14.8%
.9249
14.0%
24436
 
6.7%
33457
 
5.2%
43190
 
4.8%
93029
 
4.6%
53018
 
4.6%
62943
 
4.5%
a2930
 
4.4%
l2930
 
4.4%
Other values (12)21151
32.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number37789
57.2%
Lowercase Letter14650
 
22.2%
Other Punctuation9249
 
14.0%
Uppercase Letter2930
 
4.4%
Space Separator1465
 
2.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
19750
25.8%
24436
11.7%
33457
 
9.1%
43190
 
8.4%
93029
 
8.0%
53018
 
8.0%
62943
 
7.8%
82911
 
7.7%
72862
 
7.6%
02193
 
5.8%
Lowercase Letter
ValueCountFrequency (%)
a2930
20.0%
l2930
20.0%
o1465
10.0%
t1465
10.0%
v1465
10.0%
i1465
10.0%
b1465
10.0%
e1465
10.0%
Uppercase Letter
ValueCountFrequency (%)
N1465
50.0%
A1465
50.0%
Other Punctuation
ValueCountFrequency (%)
.9249
100.0%
Space Separator
ValueCountFrequency (%)
1465
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common48503
73.4%
Latin17580
 
26.6%

Most frequent character per script

Common
ValueCountFrequency (%)
19750
20.1%
.9249
19.1%
24436
9.1%
33457
 
7.1%
43190
 
6.6%
93029
 
6.2%
53018
 
6.2%
62943
 
6.1%
82911
 
6.0%
72862
 
5.9%
Other values (2)3658
 
7.5%
Latin
ValueCountFrequency (%)
a2930
16.7%
l2930
16.7%
N1465
8.3%
o1465
8.3%
t1465
8.3%
A1465
8.3%
v1465
8.3%
i1465
8.3%
b1465
8.3%
e1465
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII66083
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
19750
14.8%
.9249
14.0%
24436
 
6.7%
33457
 
5.2%
43190
 
4.8%
93029
 
4.6%
53018
 
4.6%
62943
 
4.5%
a2930
 
4.4%
l2930
 
4.4%
Other values (12)21151
32.0%

Fuel Oil #1 Use (kBtu)
Categorical

HIGH CORRELATION

Distinct10
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
11737 
2491199.7
 
1
4782156
 
1
5313425.7
 
1
4938946.7
 
1
Other values (5)
 
5

Length

Max length13
Median length13
Mean length12.99659459
Min length7

Characters and Unicode

Total characters152658
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9
Unique (%)0.1%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         11737   99.9%
2491199.7             1       < 0.1%
4782156               1       < 0.1%
5313425.7             1       < 0.1%
4938946.7             1       < 0.1%
4328815.3             1       < 0.1%
208597.3              1       < 0.1%
1663593.7             1       < 0.1%
555999.9              1       < 0.1%
6275849.6             1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not11737
50.0%
available11737
50.0%
5313425.71
 
< 0.1%
208597.31
 
< 0.1%
1663593.71
 
< 0.1%
2491199.71
 
< 0.1%
47821561
 
< 0.1%
4938946.71
 
< 0.1%
4328815.31
 
< 0.1%
555999.91
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a23474
15.4%
l23474
15.4%
N11737
7.7%
o11737
7.7%
t11737
7.7%
11737
7.7%
A11737
7.7%
v11737
7.7%
i11737
7.7%
b11737
7.7%
Other values (12)11814
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter117370
76.9%
Uppercase Letter23474
 
15.4%
Space Separator11737
 
7.7%
Decimal Number69
 
< 0.1%
Other Punctuation8
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
912
17.4%
510
14.5%
38
11.6%
47
10.1%
77
10.1%
26
8.7%
16
8.7%
86
8.7%
66
8.7%
01
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
a23474
20.0%
l23474
20.0%
o11737
10.0%
t11737
10.0%
v11737
10.0%
i11737
10.0%
b11737
10.0%
e11737
10.0%
Uppercase Letter
ValueCountFrequency (%)
N11737
50.0%
A11737
50.0%
Space Separator
ValueCountFrequency (%)
11737
100.0%
Other Punctuation
ValueCountFrequency (%)
.8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin140844
92.3%
Common11814
 
7.7%

Most frequent character per script

Common
ValueCountFrequency (%)
11737
99.3%
912
 
0.1%
510
 
0.1%
.8
 
0.1%
38
 
0.1%
47
 
0.1%
77
 
0.1%
26
 
0.1%
16
 
0.1%
86
 
0.1%
Other values (2)7
 
0.1%
Latin
ValueCountFrequency (%)
a23474
16.7%
l23474
16.7%
N11737
8.3%
o11737
8.3%
t11737
8.3%
A11737
8.3%
v11737
8.3%
i11737
8.3%
b11737
8.3%
e11737
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII152658
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a23474
15.4%
l23474
15.4%
N11737
7.7%
o11737
7.7%
t11737
7.7%
11737
7.7%
A11737
7.7%
v11737
7.7%
i11737
7.7%
b11737
7.7%
Other values (12)11814
7.7%

Fuel Oil #2 Use (kBtu)
Categorical

HIGH CARDINALITY

Distinct1906
Distinct (%)16.2%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
9165 
0
 
403
276000
 
48
138000
 
33
414000
 
24
Other values (1901)
2073 

Length

Max length13
Median length13
Mean length11.69768432
Min length1

Characters and Unicode

Total characters137401
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1827
Unique (%)15.6%

Sample

1st rowNot Available
2nd row1.96248472E7
3rd rowNot Available
4th rowNot Available
5th rowNot Available
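
This variable is reported as Categorical with Missing 0, even though most rows hold the placeholder string Not Available and some readings use exponent notation such as 1.96248472E7. The following is a minimal sketch of how such strings could be turned into numbers, assuming the placeholder should be treated as missing; `parse_reading` and `SENTINEL` are hypothetical names introduced for this example and are not part of the report or of the profiling tool.

```python
import math

SENTINEL = "Not Available"  # placeholder string seen in the sample rows

def parse_reading(text):
    """Return a float, or NaN when the cell holds the placeholder string."""
    text = text.strip()
    if text == SENTINEL:
        return math.nan
    # float() accepts plain decimals and exponent forms like '1.96248472E7'
    return float(text)

# Values taken from this section's sample rows and common values
sample = ["Not Available", "1.96248472E7", "0", "276000"]
parsed = [parse_reading(s) for s in sample]
```

With the placeholder mapped to NaN, the column could be profiled as numeric and its missing share would then be counted as missing rather than as a category.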

Common Values

Value                 Count   Frequency (%)
Not Available         9165    78.0%
0                     403     3.4%
276000                48      0.4%
138000                33      0.3%
414000                24      0.2%
207000                20      0.2%
552000                14      0.1%
345000                10      0.1%
828000.1              8       0.1%
414138                6       0.1%
Other values (1896)   2015    17.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not9165
43.8%
available9165
43.8%
0403
 
1.9%
27600048
 
0.2%
13800033
 
0.2%
41400024
 
0.1%
20700020
 
0.1%
55200014
 
0.1%
34500010
 
< 0.1%
828000.18
 
< 0.1%
Other values (1897)2021
 
9.7%

Most occurring characters

ValueCountFrequency (%)
a18330
13.3%
l18330
13.3%
N9165
 
6.7%
o9165
 
6.7%
t9165
 
6.7%
9165
 
6.7%
A9165
 
6.7%
v9165
 
6.7%
i9165
 
6.7%
b9165
 
6.7%
Other values (13)27421
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter91650
66.7%
Uppercase Letter18458
 
13.4%
Decimal Number16606
 
12.1%
Space Separator9165
 
6.7%
Other Punctuation1522
 
1.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
02369
14.3%
11866
11.2%
21730
10.4%
31617
9.7%
41613
9.7%
71564
9.4%
81516
9.1%
91476
8.9%
51453
8.7%
61402
8.4%
Lowercase Letter
ValueCountFrequency (%)
a18330
20.0%
l18330
20.0%
o9165
10.0%
t9165
10.0%
v9165
10.0%
i9165
10.0%
b9165
10.0%
e9165
10.0%
Uppercase Letter
ValueCountFrequency (%)
N9165
49.7%
A9165
49.7%
E128
 
0.7%
Space Separator
ValueCountFrequency (%)
9165
100.0%
Other Punctuation
ValueCountFrequency (%)
.1522
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin110108
80.1%
Common27293
 
19.9%

Most frequent character per script

Common
ValueCountFrequency (%)
9165
33.6%
02369
 
8.7%
11866
 
6.8%
21730
 
6.3%
31617
 
5.9%
41613
 
5.9%
71564
 
5.7%
.1522
 
5.6%
81516
 
5.6%
91476
 
5.4%
Other values (2)2855
 
10.5%
Latin
ValueCountFrequency (%)
a18330
16.6%
l18330
16.6%
N9165
8.3%
o9165
8.3%
t9165
8.3%
A9165
8.3%
v9165
8.3%
i9165
8.3%
b9165
8.3%
e9165
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII137401
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a18330
13.3%
l18330
13.3%
N9165
 
6.7%
o9165
 
6.7%
t9165
 
6.7%
9165
 
6.7%
A9165
 
6.7%
v9165
 
6.7%
i9165
 
6.7%
b9165
 
6.7%
Other values (13)27421
20.0%

Fuel Oil #4 Use (kBtu)
Categorical

HIGH CARDINALITY

Distinct1180
Distinct (%)10.0%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
10425 
0
 
122
292000
 
4
146000
 
3
3065999.9
 
2
Other values (1175)
1190 

Length

Max length13
Median length13
Mean length12.45445258
Min length1

Characters and Unicode

Total characters146290
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1160
Unique (%)9.9%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         10425   88.8%
0                     122     1.0%
292000                4       < 0.1%
146000                3       < 0.1%
3065999.9             2       < 0.1%
978397.2              2       < 0.1%
584000                2       < 0.1%
1188651.7             2       < 0.1%
730000.1              2       < 0.1%
1229734.7             2       < 0.1%
Other values (1170)   1180    10.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available10425
47.0%
not10425
47.0%
0122
 
0.6%
2920004
 
< 0.1%
1460003
 
< 0.1%
1256534.32
 
< 0.1%
738117.82
 
< 0.1%
2921462
 
< 0.1%
6350999.92
 
< 0.1%
5840002
 
< 0.1%
Other values (1171)1182
 
5.3%

Most occurring characters

ValueCountFrequency (%)
a20850
14.3%
l20850
14.3%
N10425
7.1%
o10425
7.1%
t10425
7.1%
10425
7.1%
A10425
7.1%
v10425
7.1%
i10425
7.1%
b10425
7.1%
Other values (13)21190
14.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter104250
71.3%
Uppercase Letter20971
 
14.3%
Space Separator10425
 
7.1%
Decimal Number9650
 
6.6%
Other Punctuation994
 
0.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
11058
11.0%
21023
10.6%
71022
10.6%
41017
10.5%
3957
9.9%
9953
9.9%
5943
9.8%
0909
9.4%
6885
9.2%
8883
9.2%
Lowercase Letter
ValueCountFrequency (%)
a20850
20.0%
l20850
20.0%
o10425
10.0%
t10425
10.0%
v10425
10.0%
i10425
10.0%
b10425
10.0%
e10425
10.0%
Uppercase Letter
ValueCountFrequency (%)
N10425
49.7%
A10425
49.7%
E121
 
0.6%
Space Separator
ValueCountFrequency (%)
10425
100.0%
Other Punctuation
ValueCountFrequency (%)
.994
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin125221
85.6%
Common21069
 
14.4%

Most frequent character per script

Common
ValueCountFrequency (%)
10425
49.5%
11058
 
5.0%
21023
 
4.9%
71022
 
4.9%
41017
 
4.8%
.994
 
4.7%
3957
 
4.5%
9953
 
4.5%
5943
 
4.5%
0909
 
4.3%
Other values (2)1768
 
8.4%
Latin
ValueCountFrequency (%)
a20850
16.7%
l20850
16.7%
N10425
8.3%
o10425
8.3%
t10425
8.3%
A10425
8.3%
v10425
8.3%
i10425
8.3%
b10425
8.3%
e10425
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII146290
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a20850
14.3%
l20850
14.3%
N10425
7.1%
o10425
7.1%
t10425
7.1%
10425
7.1%
A10425
7.1%
v10425
7.1%
i10425
7.1%
b10425
7.1%
Other values (13)21190
14.5%

Fuel Oil #5 & 6 Use (kBtu)
Categorical

HIGH CARDINALITY

Distinct259
Distinct (%)2.2%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
11152 
0
 
331
450000
 
4
4459200.1
 
3
449999.9
 
2
Other values (254)
 
254

Length

Max length13
Median length13
Mean length12.56589477
Min length1

Characters and Unicode

Total characters147599
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique254
Unique (%)2.2%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         11152   94.9%
0                     331     2.8%
450000                4       < 0.1%
4459200.1             3       < 0.1%
449999.9              2       < 0.1%
7030143.7             1       < 0.1%
4669043.6             1       < 0.1%
300165                1       < 0.1%
1.28301407E7          1       < 0.1%
267826.5              1       < 0.1%
Other values (249)    249     2.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not11152
48.7%
available11152
48.7%
0331
 
1.4%
4500004
 
< 0.1%
4459200.13
 
< 0.1%
449999.92
 
< 0.1%
4795951
 
< 0.1%
18472501
 
< 0.1%
2.15502647e71
 
< 0.1%
6300433.91
 
< 0.1%
Other values (250)250
 
1.1%

Most occurring characters

ValueCountFrequency (%)
a22304
15.1%
l22304
15.1%
N11152
7.6%
o11152
7.6%
t11152
7.6%
11152
7.6%
A11152
7.6%
v11152
7.6%
i11152
7.6%
b11152
7.6%
Other values (13)13775
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter111520
75.6%
Uppercase Letter22329
 
15.1%
Space Separator11152
 
7.6%
Decimal Number2383
 
1.6%
Other Punctuation215
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0601
25.2%
5250
10.5%
9240
 
10.1%
4219
 
9.2%
1194
 
8.1%
7192
 
8.1%
3180
 
7.6%
2173
 
7.3%
6170
 
7.1%
8164
 
6.9%
Lowercase Letter
ValueCountFrequency (%)
a22304
20.0%
l22304
20.0%
o11152
10.0%
t11152
10.0%
v11152
10.0%
i11152
10.0%
b11152
10.0%
e11152
10.0%
Uppercase Letter
ValueCountFrequency (%)
N11152
49.9%
A11152
49.9%
E25
 
0.1%
Space Separator
ValueCountFrequency (%)
11152
100.0%
Other Punctuation
ValueCountFrequency (%)
.215
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin133849
90.7%
Common13750
 
9.3%

Most frequent character per script

Common
ValueCountFrequency (%)
11152
81.1%
0601
 
4.4%
5250
 
1.8%
9240
 
1.7%
4219
 
1.6%
.215
 
1.6%
1194
 
1.4%
7192
 
1.4%
3180
 
1.3%
2173
 
1.3%
Other values (2)334
 
2.4%
Latin
ValueCountFrequency (%)
a22304
16.7%
l22304
16.7%
N11152
8.3%
o11152
8.3%
t11152
8.3%
A11152
8.3%
v11152
8.3%
i11152
8.3%
b11152
8.3%
e11152
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII147599
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a22304
15.1%
l22304
15.1%
N11152
7.6%
o11152
7.6%
t11152
7.6%
11152
7.6%
A11152
7.6%
v11152
7.6%
i11152
7.6%
b11152
7.6%
Other values (13)13775
9.3%

Diesel #2 Use (kBtu)
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct15
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
11730 
0
 
3
1.43517792E7
 
1
2472960.3
 
1
73278
 
1
Other values (10)
 
10

Length

Max length13
Median length13
Mean length12.9903797
Min length1

Characters and Unicode

Total characters152585
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique13
Unique (%)0.1%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         11730   99.9%
0                     3       < 0.1%
1.43517792E7          1       < 0.1%
2472960.3             1       < 0.1%
73278                 1       < 0.1%
276662.4              1       < 0.1%
316562.7              1       < 0.1%
207004.1              1       < 0.1%
138000                1       < 0.1%
414000                1       < 0.1%
Other values (5)      5       < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not11730
50.0%
available11730
50.0%
03
 
< 0.1%
732781
 
< 0.1%
316562.71
 
< 0.1%
207004.11
 
< 0.1%
690001
 
< 0.1%
2070001
 
< 0.1%
2667541
 
< 0.1%
2472960.31
 
< 0.1%
Other values (6)6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a23460
15.4%
l23460
15.4%
N11730
7.7%
o11730
7.7%
t11730
7.7%
11730
7.7%
A11730
7.7%
v11730
7.7%
i11730
7.7%
b11730
7.7%
Other values (13)11825
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter117300
76.9%
Uppercase Letter23461
 
15.4%
Space Separator11730
 
7.7%
Decimal Number88
 
0.1%
Other Punctuation6
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
023
26.1%
213
14.8%
711
12.5%
69
 
10.2%
47
 
8.0%
16
 
6.8%
36
 
6.8%
95
 
5.7%
84
 
4.5%
54
 
4.5%
Lowercase Letter
ValueCountFrequency (%)
a23460
20.0%
l23460
20.0%
o11730
10.0%
t11730
10.0%
v11730
10.0%
i11730
10.0%
b11730
10.0%
e11730
10.0%
Uppercase Letter
ValueCountFrequency (%)
N11730
50.0%
A11730
50.0%
E1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
11730
100.0%
Other Punctuation
ValueCountFrequency (%)
.6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin140761
92.3%
Common11824
 
7.7%

Most frequent character per script

Common
ValueCountFrequency (%)
11730
99.2%
023
 
0.2%
213
 
0.1%
711
 
0.1%
69
 
0.1%
47
 
0.1%
16
 
0.1%
.6
 
0.1%
36
 
0.1%
95
 
< 0.1%
Other values (2)8
 
0.1%
Latin
ValueCountFrequency (%)
a23460
16.7%
l23460
16.7%
N11730
8.3%
o11730
8.3%
t11730
8.3%
A11730
8.3%
v11730
8.3%
i11730
8.3%
b11730
8.3%
e11730
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII152585
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a23460
15.4%
l23460
15.4%
N11730
7.7%
o11730
7.7%
t11730
7.7%
11730
7.7%
A11730
7.7%
v11730
7.7%
i11730
7.7%
b11730
7.7%
Other values (13)11825
7.7%

District Steam Use (kBtu)
Categorical

HIGH CARDINALITY

Distinct927
Distinct (%)7.9%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
10810 
1.02082936E7
 
4
0
 
3
2381470.8
 
2
1.14810858E7
 
2
Other values (922)
 
925

Length

Max length14
Median length13
Mean length12.76996424
Min length1

Characters and Unicode

Total characters149996
Distinct characters24
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique919
Unique (%)7.8%

Sample

1st row5.15506751E7
2nd row-3.914148026E8
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         10810   92.0%
1.02082936E7          4       < 0.1%
0                     3       < 0.1%
2381470.8             2       < 0.1%
1.14810858E7          2       < 0.1%
9.03543681E7          2       < 0.1%
5462645.7             2       < 0.1%
52666152              2       < 0.1%
1.11197219E7          1       < 0.1%
4003291               1       < 0.1%
Other values (917)    917     7.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not10810
47.9%
available10810
47.9%
1.02082936e74
 
< 0.1%
03
 
< 0.1%
9.03543681e72
 
< 0.1%
1.14810858e72
 
< 0.1%
2381470.82
 
< 0.1%
5462645.72
 
< 0.1%
526661522
 
< 0.1%
10524.81
 
< 0.1%
Other values (918)918
 
4.1%

Most occurring characters

ValueCountFrequency (%)
a21620
14.4%
l21620
14.4%
N10810
7.2%
o10810
7.2%
t10810
7.2%
10810
7.2%
A10810
7.2%
v10810
7.2%
i10810
7.2%
b10810
7.2%
Other values (14)20276
13.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter108100
72.1%
Uppercase Letter22029
 
14.7%
Space Separator10810
 
7.2%
Decimal Number8236
 
5.5%
Other Punctuation819
 
0.5%
Dash Punctuation2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
71057
12.8%
11014
12.3%
2905
11.0%
8808
9.8%
3806
9.8%
4768
9.3%
5759
9.2%
9742
9.0%
6706
8.6%
0671
8.1%
Lowercase Letter
ValueCountFrequency (%)
a21620
20.0%
l21620
20.0%
o10810
10.0%
t10810
10.0%
v10810
10.0%
i10810
10.0%
b10810
10.0%
e10810
10.0%
Uppercase Letter
ValueCountFrequency (%)
N10810
49.1%
A10810
49.1%
E409
 
1.9%
Other Punctuation
ValueCountFrequency (%)
.819
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2
100.0%
Space Separator
ValueCountFrequency (%)
10810
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin130129
86.8%
Common19867
 
13.2%

Most frequent character per script

Common
ValueCountFrequency (%)
10810
54.4%
71057
 
5.3%
11014
 
5.1%
2905
 
4.6%
.819
 
4.1%
8808
 
4.1%
3806
 
4.1%
4768
 
3.9%
5759
 
3.8%
9742
 
3.7%
Other values (3)1379
 
6.9%
Latin
ValueCountFrequency (%)
a21620
16.6%
l21620
16.6%
N10810
8.3%
o10810
8.3%
t10810
8.3%
A10810
8.3%
v10810
8.3%
i10810
8.3%
b10810
8.3%
e10810
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII149996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a21620
14.4%
l21620
14.4%
N10810
7.2%
o10810
7.2%
t10810
7.2%
10810
7.2%
A10810
7.2%
v10810
7.2%
i10810
7.2%
b10810
7.2%
Other values (14)20276
13.5%

Natural Gas Use (kBtu)
Categorical

HIGH CARDINALITY

Distinct10155
Distinct (%)86.5%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
1442 
0
 
46
2428687.2
 
6
5291070.4
 
5
3817199.9
 
5
Other values (10150)
10242 

Length

Max length14
Median length9
Mean length9.14566661
Min length1

Characters and Unicode

Total characters107425
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10071
Unique (%)85.7%

Sample

1st rowNot Available
2nd row933073441
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                  Count   Frequency (%)
Not Available          1442    12.3%
0                      46      0.4%
2428687.2              6       0.1%
5291070.4              5       < 0.1%
3817199.9              5       < 0.1%
545200                 4       < 0.1%
200                    4       < 0.1%
8207900                4       < 0.1%
9826900.3              4       < 0.1%
6559300.3              3       < 0.1%
Other values (10145)   10223   87.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available1442
 
10.9%
not1442
 
10.9%
046
 
0.3%
2428687.26
 
< 0.1%
3817199.95
 
< 0.1%
5291070.45
 
< 0.1%
2004
 
< 0.1%
5452004
 
< 0.1%
9826900.34
 
< 0.1%
82079004
 
< 0.1%
Other values (10146)10226
77.5%

Most occurring characters

ValueCountFrequency (%)
09513
 
8.9%
18631
 
8.0%
98480
 
7.9%
.8020
 
7.5%
78016
 
7.5%
38004
 
7.5%
27926
 
7.4%
47645
 
7.1%
57330
 
6.8%
67141
 
6.6%
Other values (13)26719
24.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number79382
73.9%
Lowercase Letter14420
 
13.4%
Other Punctuation8020
 
7.5%
Uppercase Letter4161
 
3.9%
Space Separator1442
 
1.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
09513
12.0%
18631
10.9%
98480
10.7%
78016
10.1%
38004
10.1%
27926
10.0%
47645
9.6%
57330
9.2%
67141
9.0%
86696
8.4%
Lowercase Letter
ValueCountFrequency (%)
a2884
20.0%
l2884
20.0%
o1442
10.0%
t1442
10.0%
v1442
10.0%
i1442
10.0%
b1442
10.0%
e1442
10.0%
Uppercase Letter
ValueCountFrequency (%)
N1442
34.7%
A1442
34.7%
E1277
30.7%
Space Separator
ValueCountFrequency (%)
1442
100.0%
Other Punctuation
ValueCountFrequency (%)
.8020
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common88844
82.7%
Latin18581
 
17.3%

Most frequent character per script

Common
ValueCountFrequency (%)
09513
10.7%
18631
9.7%
98480
9.5%
.8020
9.0%
78016
9.0%
38004
9.0%
27926
8.9%
47645
8.6%
57330
8.3%
67141
8.0%
Other values (2)8138
9.2%
Latin
ValueCountFrequency (%)
a2884
15.5%
l2884
15.5%
N1442
7.8%
o1442
7.8%
t1442
7.8%
A1442
7.8%
v1442
7.8%
i1442
7.8%
b1442
7.8%
e1442
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII107425
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
09513
 
8.9%
18631
 
8.0%
98480
 
7.9%
.8020
 
7.5%
78016
 
7.5%
38004
 
7.5%
27926
 
7.4%
47645
 
7.1%
57330
 
6.8%
67141
 
6.6%
Other values (13)26719
24.9%
Distinct9632
Distinct (%)82.0%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
1962 
0
 
41
24695.5
 
6
57397.8
 
5
2
 
4
Other values (9627)
9728 

Length

Max length14
Median length7
Mean length7.597820535
Min length1

Characters and Unicode

Total characters89244
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9532
Unique (%)81.2%

Sample

1st rowNot Available
2nd row9330734.4
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

Value                 Count   Frequency (%)
Not Available         1962    16.7%
0                     41      0.3%
24695.5               6       0.1%
57397.8               5       < 0.1%
2                     4       < 0.1%
82079                 4       < 0.1%
77                    3       < 0.1%
6                     3       < 0.1%
6.7                   3       < 0.1%
5                     3       < 0.1%
Other values (9622)   9712    82.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available1962
 
14.3%
not1962
 
14.3%
041
 
0.3%
24695.56
 
< 0.1%
57397.85
 
< 0.1%
820794
 
< 0.1%
24
 
< 0.1%
53
 
< 0.1%
773
 
< 0.1%
6.73
 
< 0.1%
Other values (9623)9715
70.9%

Most occurring characters

ValueCountFrequency (%)
.8215
 
9.2%
16838
 
7.7%
46025
 
6.8%
26024
 
6.8%
35900
 
6.6%
55892
 
6.6%
65554
 
6.2%
75290
 
5.9%
84973
 
5.6%
94955
 
5.6%
Other values (13)29578
33.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number55516
62.2%
Lowercase Letter19620
 
22.0%
Other Punctuation8215
 
9.2%
Uppercase Letter3931
 
4.4%
Space Separator1962
 
2.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
16838
12.3%
46025
10.9%
26024
10.9%
35900
10.6%
55892
10.6%
65554
10.0%
75290
9.5%
84973
9.0%
94955
8.9%
04065
7.3%
Lowercase Letter
ValueCountFrequency (%)
a3924
20.0%
l3924
20.0%
o1962
10.0%
t1962
10.0%
v1962
10.0%
i1962
10.0%
b1962
10.0%
e1962
10.0%
Uppercase Letter
ValueCountFrequency (%)
N1962
49.9%
A1962
49.9%
E7
 
0.2%
Space Separator
ValueCountFrequency (%)
1962
100.0%
Other Punctuation
ValueCountFrequency (%)
.8215
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common65693
73.6%
Latin23551
 
26.4%

Most frequent character per script

Common
ValueCountFrequency (%)
.8215
12.5%
16838
10.4%
46025
9.2%
26024
9.2%
35900
9.0%
55892
9.0%
65554
8.5%
75290
8.1%
84973
7.6%
94955
7.5%
Other values (2)6027
9.2%
Latin
ValueCountFrequency (%)
a3924
16.7%
l3924
16.7%
N1962
8.3%
o1962
8.3%
t1962
8.3%
A1962
8.3%
v1962
8.3%
i1962
8.3%
b1962
8.3%
e1962
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII89244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.8215
 
9.2%
16838
 
7.7%
46025
 
6.8%
26024
 
6.8%
35900
 
6.6%
55892
 
6.6%
65554
 
6.2%
75290
 
5.9%
84973
 
5.6%
94955
 
5.6%
Other values (13)29578
33.1%

Electricity Use - Grid Purchase (kBtu)
Categorical

HIGH CARDINALITY

Distinct11406
Distinct (%)97.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
 
244
0
 
23
250213.2
 
7
5195170.4
 
6
2076253.1
 
5
Other values (11401)
11461 

Length

Max length13
Median length9
Mean length8.737612804
Min length1

Characters and Unicode

Total characters102632
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11354
Unique (%)96.7%

Sample

1st row38139374.2
2nd row332365924
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

ValueCountFrequency (%)
Not Available244
 
2.1%
023
 
0.2%
250213.27
 
0.1%
5195170.46
 
0.1%
2076253.15
 
< 0.1%
1071596.64
 
< 0.1%
2678659.54
 
< 0.1%
276034.24
 
< 0.1%
2349366.74
 
< 0.1%
1929827.23
 
< 0.1%
Other values (11396)11442
97.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not244
 
2.0%
available244
 
2.0%
023
 
0.2%
250213.27
 
0.1%
5195170.46
 
0.1%
2076253.15
 
< 0.1%
2678659.54
 
< 0.1%
1071596.64
 
< 0.1%
276034.24
 
< 0.1%
2349366.74
 
< 0.1%
Other values (11397)11445
95.5%

Most occurring characters

ValueCountFrequency (%)
112394
12.1%
.10280
10.0%
29736
9.5%
39035
8.8%
48710
8.5%
58506
8.3%
88464
8.2%
78458
8.2%
98411
8.2%
68385
8.2%
Other values (12)10253
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number89180
86.9%
Other Punctuation10280
 
10.0%
Lowercase Letter2440
 
2.4%
Uppercase Letter488
 
0.5%
Space Separator244
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
112394
13.9%
29736
10.9%
39035
10.1%
48710
9.8%
58506
9.5%
88464
9.5%
78458
9.5%
98411
9.4%
68385
9.4%
07081
7.9%
Lowercase Letter
ValueCountFrequency (%)
a488
20.0%
l488
20.0%
o244
10.0%
t244
10.0%
v244
10.0%
i244
10.0%
b244
10.0%
e244
10.0%
Uppercase Letter
ValueCountFrequency (%)
N244
50.0%
A244
50.0%
Other Punctuation
ValueCountFrequency (%)
.10280
100.0%
Space Separator
ValueCountFrequency (%)
244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common99704
97.1%
Latin2928
 
2.9%

Most frequent character per script

Common
ValueCountFrequency (%)
112394
12.4%
.10280
10.3%
29736
9.8%
39035
9.1%
48710
8.7%
58506
8.5%
88464
8.5%
78458
8.5%
98411
8.4%
68385
8.4%
Other values (2)7325
7.3%
Latin
ValueCountFrequency (%)
a488
16.7%
l488
16.7%
N244
8.3%
o244
8.3%
t244
8.3%
A244
8.3%
v244
8.3%
i244
8.3%
b244
8.3%
e244
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII102632
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
112394
12.1%
.10280
10.0%
29736
9.5%
39035
8.8%
48710
8.5%
58506
8.3%
88464
8.2%
78458
8.2%
98411
8.2%
68385
8.2%
Other values (12)10253
10.0%
Distinct10879
Distinct (%)92.6%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
 
786
0
 
23
73333.3
 
7
1522617.2
 
6
591746.2
 
5
Other values (10874)
10919 

Length

Max length13
Median length8
Mean length8.425080879
Min length1

Characters and Unicode

Total characters98961
Distinct characters23
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10840
Unique (%)92.3%

Sample

1st row1.10827705E7
2nd row9.62613121E7
3rd rowNot Available
4th rowNot Available
5th rowNot Available
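The sample rows above mix scientific notation with the "Not Available" placeholder, which is why the variable is profiled as text rather than as a number. A minimal sketch of one way to recover numeric values follows, assuming "Not Available" is the only non-numeric entry; the function name `parse_reading` is hypothetical.

```python
def parse_reading(text):
    """Return a float for numeric text, or None for the placeholder.

    Assumes "Not Available" is the only non-numeric value in the column.
    """
    text = text.strip()
    if text == "Not Available":
        return None
    # float() accepts both plain decimals and E-notation such as 1.10827705E7
    return float(text)


# Sample rows from this variable
sample = ["1.10827705E7", "9.62613121E7", "Not Available"]
parsed = [parse_reading(s) for s in sample]
print(parsed)
```

Any other unexpected string would raise `ValueError` here, which is a deliberate choice in this sketch so that unknown placeholders surface instead of being silently dropped.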

Common Values

ValueCountFrequency (%)
Not Available786
 
6.7%
023
 
0.2%
73333.37
 
0.1%
1522617.26
 
0.1%
591746.25
 
< 0.1%
3140674
 
< 0.1%
688559.94
 
< 0.1%
765478.34
 
< 0.1%
84427.64
 
< 0.1%
565599.93
 
< 0.1%
Other values (10869)10900
92.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available786
 
6.3%
not786
 
6.3%
023
 
0.2%
73333.37
 
0.1%
1522617.26
 
< 0.1%
591746.25
 
< 0.1%
688559.94
 
< 0.1%
3140674
 
< 0.1%
84427.64
 
< 0.1%
765478.34
 
< 0.1%
Other values (10870)10903
87.0%

Most occurring characters

ValueCountFrequency (%)
.9621
9.7%
19471
9.6%
29442
9.5%
38737
8.8%
48045
8.1%
57710
7.8%
97464
7.5%
77452
7.5%
67319
7.4%
87230
7.3%
Other values (13)16470
16.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number78869
79.7%
Other Punctuation9621
 
9.7%
Lowercase Letter7860
 
7.9%
Uppercase Letter1825
 
1.8%
Space Separator786
 
0.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
19471
12.0%
29442
12.0%
38737
11.1%
48045
10.2%
57710
9.8%
97464
9.5%
77452
9.4%
67319
9.3%
87230
9.2%
05999
7.6%
Lowercase Letter
ValueCountFrequency (%)
a1572
20.0%
l1572
20.0%
o786
10.0%
t786
10.0%
v786
10.0%
i786
10.0%
b786
10.0%
e786
10.0%
Uppercase Letter
ValueCountFrequency (%)
N786
43.1%
A786
43.1%
E253
 
13.9%
Other Punctuation
ValueCountFrequency (%)
.9621
100.0%
Space Separator
ValueCountFrequency (%)
786
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common89276
90.2%
Latin9685
 
9.8%

Most frequent character per script

Common
ValueCountFrequency (%)
.9621
10.8%
19471
10.6%
29442
10.6%
38737
9.8%
48045
9.0%
57710
8.6%
97464
8.4%
77452
8.3%
67319
8.2%
87230
8.1%
Other values (2)6785
7.6%
Latin
ValueCountFrequency (%)
a1572
16.2%
l1572
16.2%
N786
8.1%
o786
8.1%
t786
8.1%
A786
8.1%
v786
8.1%
i786
8.1%
b786
8.1%
e786
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII98961
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.9621
9.7%
19471
9.6%
29442
9.5%
38737
8.8%
48045
8.1%
57710
7.8%
97464
7.5%
77452
7.5%
67319
7.4%
87230
7.3%
Other values (13)16470
16.6%

Total GHG Emissions (Metric Tons CO2e)
Categorical

HIGH CARDINALITY

Distinct7818
Distinct (%)66.6%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
0
 
108
Not Available
 
74
293.6
 
8
293.1
 
7
462.7
 
7
Other values (7813)
11542 

Length

Max length13
Median length5
Mean length5.017537885
Min length1

Characters and Unicode

Total characters58936
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5408
Unique (%)46.0%

Sample

1st row6962.2
2nd row55870.4
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0108
 
0.9%
Not Available74
 
0.6%
293.68
 
0.1%
293.17
 
0.1%
462.77
 
0.1%
412.77
 
0.1%
375.47
 
0.1%
428.37
 
0.1%
316.46
 
0.1%
258.66
 
0.1%
Other values (7808)11509
98.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0108
 
0.9%
not74
 
0.6%
available74
 
0.6%
293.68
 
0.1%
293.17
 
0.1%
428.37
 
0.1%
375.47
 
0.1%
412.77
 
0.1%
462.77
 
0.1%
295.16
 
0.1%
Other values (7809)11515
97.4%

Most occurring characters

ValueCountFrequency (%)
.10386
17.6%
16002
10.2%
35946
10.1%
25534
9.4%
45368
9.1%
54876
8.3%
64524
7.7%
74333
7.4%
94157
7.1%
83970
 
6.7%
Other values (12)3840
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number47588
80.7%
Other Punctuation10386
 
17.6%
Lowercase Letter740
 
1.3%
Uppercase Letter148
 
0.3%
Space Separator74
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
16002
12.6%
35946
12.5%
25534
11.6%
45368
11.3%
54876
10.2%
64524
9.5%
74333
9.1%
94157
8.7%
83970
8.3%
02878
6.0%
Lowercase Letter
ValueCountFrequency (%)
a148
20.0%
l148
20.0%
o74
10.0%
t74
10.0%
v74
10.0%
i74
10.0%
b74
10.0%
e74
10.0%
Uppercase Letter
ValueCountFrequency (%)
N74
50.0%
A74
50.0%
Other Punctuation
ValueCountFrequency (%)
.10386
100.0%
Space Separator
ValueCountFrequency (%)
74
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common58048
98.5%
Latin888
 
1.5%

Most frequent character per script

Common
ValueCountFrequency (%)
.10386
17.9%
16002
10.3%
35946
10.2%
25534
9.5%
45368
9.2%
54876
8.4%
64524
7.8%
74333
7.5%
94157
7.2%
83970
 
6.8%
Other values (2)2952
 
5.1%
Latin
ValueCountFrequency (%)
a148
16.7%
l148
16.7%
N74
8.3%
o74
8.3%
t74
8.3%
A74
8.3%
v74
8.3%
i74
8.3%
b74
8.3%
e74
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII58936
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.10386
17.6%
16002
10.2%
35946
10.1%
25534
9.4%
45368
9.1%
54876
8.3%
64524
7.7%
74333
7.4%
94157
7.1%
83970
 
6.7%
Other values (12)3840
 
6.5%

Direct GHG Emissions (Metric Tons CO2e)
Categorical

HIGH CARDINALITY

Distinct5968
Distinct (%)50.8%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
0
 
892
Not Available
 
83
210.1
 
9
259.1
 
8
280.3
 
8
Other values (5963)
10746 

Length

Max length13
Median length5
Mean length4.498637834
Min length1

Characters and Unicode

Total characters52841
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3336
Unique (%)28.4%

Sample

1st row0
2nd row51016.4
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0892
 
7.6%
Not Available83
 
0.7%
210.19
 
0.1%
259.18
 
0.1%
280.38
 
0.1%
209.58
 
0.1%
267.78
 
0.1%
213.18
 
0.1%
375.58
 
0.1%
2668
 
0.1%
Other values (5958)10706
91.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0892
 
7.5%
available83
 
0.7%
not83
 
0.7%
210.19
 
0.1%
375.58
 
0.1%
267.78
 
0.1%
213.18
 
0.1%
259.18
 
0.1%
209.58
 
0.1%
280.38
 
0.1%
Other values (5959)10714
90.6%

Most occurring characters

ValueCountFrequency (%)
.9715
18.4%
26035
11.4%
15721
10.8%
35067
9.6%
44413
8.4%
53816
 
7.2%
63600
 
6.8%
73453
 
6.5%
93433
 
6.5%
83388
 
6.4%
Other values (12)4200
7.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number42047
79.6%
Other Punctuation9715
 
18.4%
Lowercase Letter830
 
1.6%
Uppercase Letter166
 
0.3%
Space Separator83
 
0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
26035
14.4%
15721
13.6%
35067
12.1%
44413
10.5%
53816
9.1%
63600
8.6%
73453
8.2%
93433
8.2%
83388
8.1%
03121
7.4%
Lowercase Letter
ValueCountFrequency (%)
a166
20.0%
l166
20.0%
o83
10.0%
t83
10.0%
v83
10.0%
i83
10.0%
b83
10.0%
e83
10.0%
Uppercase Letter
ValueCountFrequency (%)
N83
50.0%
A83
50.0%
Other Punctuation
ValueCountFrequency (%)
.9715
100.0%
Space Separator
ValueCountFrequency (%)
83
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common51845
98.1%
Latin996
 
1.9%

Most frequent character per script

Common
ValueCountFrequency (%)
.9715
18.7%
26035
11.6%
15721
11.0%
35067
9.8%
44413
8.5%
53816
 
7.4%
63600
 
6.9%
73453
 
6.7%
93433
 
6.6%
83388
 
6.5%
Other values (2)3204
 
6.2%
Latin
ValueCountFrequency (%)
a166
16.7%
l166
16.7%
N83
8.3%
o83
8.3%
t83
8.3%
A83
8.3%
v83
8.3%
i83
8.3%
b83
8.3%
e83
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII52841
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.9715
18.4%
26035
11.4%
15721
10.8%
35067
9.6%
44413
8.4%
53816
 
7.2%
63600
 
6.8%
73453
 
6.5%
93433
 
6.5%
83388
 
6.4%
Other values (12)4200
7.9%
Distinct5853
Distinct (%)49.8%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
0
 
194
Not Available
 
65
74.6
 
15
75.7
 
15
87.3
 
15
Other values (5848)
11442 

Length

Max length13
Median length5
Mean length4.643027414
Min length1

Characters and Unicode

Total characters54537
Distinct characters23
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3636
Unique (%)31.0%

Sample

1st row6962.2
2nd row4854.1
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0194
 
1.7%
Not Available65
 
0.6%
74.615
 
0.1%
75.715
 
0.1%
87.315
 
0.1%
103.714
 
0.1%
89.313
 
0.1%
79.612
 
0.1%
83.312
 
0.1%
8912
 
0.1%
Other values (5843)11379
96.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
0194
 
1.6%
available65
 
0.6%
not65
 
0.6%
74.615
 
0.1%
87.315
 
0.1%
75.715
 
0.1%
103.714
 
0.1%
89.313
 
0.1%
85.112
 
0.1%
8912
 
0.1%
Other values (5844)11391
96.4%

Most occurring characters

ValueCountFrequency (%)
.10344
19.0%
17715
14.1%
25128
9.4%
34382
8.0%
74069
 
7.5%
43983
 
7.3%
63957
 
7.3%
83932
 
7.2%
53834
 
7.0%
93714
 
6.8%
Other values (13)3479
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number43347
79.5%
Other Punctuation10344
 
19.0%
Lowercase Letter650
 
1.2%
Uppercase Letter130
 
0.2%
Space Separator65
 
0.1%
Dash Punctuation1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
17715
17.8%
25128
11.8%
34382
10.1%
74069
9.4%
43983
9.2%
63957
9.1%
83932
9.1%
53834
8.8%
93714
8.6%
02633
 
6.1%
Lowercase Letter
ValueCountFrequency (%)
a130
20.0%
l130
20.0%
o65
10.0%
t65
10.0%
v65
10.0%
i65
10.0%
b65
10.0%
e65
10.0%
Uppercase Letter
ValueCountFrequency (%)
N65
50.0%
A65
50.0%
Other Punctuation
ValueCountFrequency (%)
.10344
100.0%
Dash Punctuation
ValueCountFrequency (%)
-1
100.0%
Space Separator
ValueCountFrequency (%)
65
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common53757
98.6%
Latin780
 
1.4%

Most frequent character per script

Common
ValueCountFrequency (%)
.10344
19.2%
17715
14.4%
25128
9.5%
34382
8.2%
74069
 
7.6%
43983
 
7.4%
63957
 
7.4%
83932
 
7.3%
53834
 
7.1%
93714
 
6.9%
Other values (3)2699
 
5.0%
Latin
ValueCountFrequency (%)
a130
16.7%
l130
16.7%
N65
8.3%
o65
8.3%
t65
8.3%
A65
8.3%
v65
8.3%
i65
8.3%
b65
8.3%
e65
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII54537
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.10344
19.0%
17715
14.1%
25128
9.4%
34382
8.0%
74069
 
7.5%
43983
 
7.3%
63957
 
7.3%
83932
 
7.2%
53834
 
7.0%
93714
 
6.8%
Other values (13)3479
 
6.4%

Property GFA - Self-Reported (ft²)
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct9466
Distinct (%)80.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean167373.9021
Minimum0
Maximum14217119
Zeros2
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size91.9 KiB

Quantile statistics

Minimum0
5-th percentile53000
Q166994
median94080
Q3158414
95-th percentile485355.5
Maximum14217119
Range14217119
Interquartile range (IQR)91420

Descriptive statistics

Standard deviation318923.7602
Coefficient of variation (CV)1.905456921
Kurtosis543.5761109
Mean167373.9021
Median Absolute Deviation (MAD)33245.5
Skewness17.51552998
Sum1965973854
Variance1.017123648 × 10^11
MonotonicityNot monotonic
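Several of the statistics above are derived from others, so they can be cross-checked with simple arithmetic. The sketch below recomputes the interquartile range, the coefficient of variation, and the variance from the reported quartiles, mean, and standard deviation; the variable names are illustrative only.

```python
# Values copied from the statistics reported for this variable
q1, q3 = 66994, 158414
mean = 167373.9021
std = 318923.7602

iqr = q3 - q1          # interquartile range: Q3 - Q1
cv = std / mean        # coefficient of variation: std / mean
variance = std ** 2    # variance is the square of the standard deviation

print(f"IQR      = {iqr}")
print(f"CV       = {cv:.9f}")
print(f"Variance = {variance:.9e}")
```

A coefficient of variation near 1.9, together with the large skewness and kurtosis, is consistent with a small number of very large properties dominating the upper tail.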
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
7000066
 
0.6%
6000048
 
0.4%
8000033
 
0.3%
6500030
 
0.3%
7500030
 
0.3%
6300029
 
0.2%
5200028
 
0.2%
5400028
 
0.2%
12000027
 
0.2%
6600026
 
0.2%
Other values (9456)11401
97.1%
ValueCountFrequency (%)
02
< 0.1%
501
< 0.1%
541
< 0.1%
1201
< 0.1%
1211
< 0.1%
1581
< 0.1%
2001
< 0.1%
5001
< 0.1%
27001
< 0.1%
31901
< 0.1%
ValueCountFrequency (%)
142171191
< 0.1%
104775711
< 0.1%
89421761
< 0.1%
78157081
< 0.1%
69404501
< 0.1%
63853821
< 0.1%
58180831
< 0.1%
38891811
< 0.1%
36366831
< 0.1%
34655631
< 0.1%

Water Use (All Water Sources) (kgal)
Categorical

HIGH CARDINALITY

Distinct7230
Distinct (%)61.6%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
3984 
0
 
64
4216.8
 
4
1947.2
 
4
1538.7
 
4
Other values (7225)
7686 

Length

Max length13
Median length6
Mean length8.249616891
Min length1

Characters and Unicode

Total characters96900
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6797
Unique (%)57.9%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

ValueCountFrequency (%)
Not Available3984
33.9%
064
 
0.5%
4216.84
 
< 0.1%
1947.24
 
< 0.1%
1538.74
 
< 0.1%
624.63
 
< 0.1%
2308.53
 
< 0.1%
3899.63
 
< 0.1%
1400.43
 
< 0.1%
246.13
 
< 0.1%
Other values (7220)7671
65.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available3984
25.3%
not3984
25.3%
064
 
0.4%
1947.24
 
< 0.1%
4216.84
 
< 0.1%
1538.74
 
< 0.1%
4922.93
 
< 0.1%
4376.93
 
< 0.1%
72153
 
< 0.1%
4939.43
 
< 0.1%
Other values (7221)7674
48.8%

Most occurring characters

ValueCountFrequency (%)
a7968
 
8.2%
l7968
 
8.2%
.6818
 
7.0%
14859
 
5.0%
24474
 
4.6%
34212
 
4.3%
44120
 
4.3%
N3984
 
4.1%
o3984
 
4.1%
t3984
 
4.1%
Other values (12)44529
46.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter39840
41.1%
Decimal Number38290
39.5%
Uppercase Letter7968
 
8.2%
Other Punctuation6818
 
7.0%
Space Separator3984
 
4.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
14859
12.7%
24474
11.7%
34212
11.0%
44120
10.8%
53839
10.0%
63637
9.5%
73576
9.3%
83553
9.3%
93420
8.9%
02600
6.8%
Lowercase Letter
ValueCountFrequency (%)
a7968
20.0%
l7968
20.0%
o3984
10.0%
t3984
10.0%
v3984
10.0%
i3984
10.0%
b3984
10.0%
e3984
10.0%
Uppercase Letter
ValueCountFrequency (%)
N3984
50.0%
A3984
50.0%
Space Separator
ValueCountFrequency (%)
3984
100.0%
Other Punctuation
ValueCountFrequency (%)
.6818
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common49092
50.7%
Latin47808
49.3%

Most frequent character per script

Common
ValueCountFrequency (%)
.6818
13.9%
14859
9.9%
24474
9.1%
34212
8.6%
44120
8.4%
3984
8.1%
53839
7.8%
63637
7.4%
73576
7.3%
83553
7.2%
Other values (2)6020
12.3%
Latin
ValueCountFrequency (%)
a7968
16.7%
l7968
16.7%
N3984
8.3%
o3984
8.3%
t3984
8.3%
A3984
8.3%
v3984
8.3%
i3984
8.3%
b3984
8.3%
e3984
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII96900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a7968
 
8.2%
l7968
 
8.2%
.6818
 
7.0%
14859
 
5.0%
24474
 
4.6%
34212
 
4.3%
44120
 
4.3%
N3984
 
4.1%
o3984
 
4.1%
t3984
 
4.1%
Other values (12)44529
46.0%
Distinct5607
Distinct (%)47.7%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
3984 
0
 
67
0.04
 
6
1.16
 
5
35.6
 
5
Other values (5602)
7679 

Length

Max length13
Median length5
Mean length7.649497701
Min length1

Characters and Unicode

Total characters89851
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4017
Unique (%)34.2%

Sample

1st rowNot Available
2nd rowNot Available
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

ValueCountFrequency (%)
Not Available3984
33.9%
067
 
0.6%
0.046
 
0.1%
1.165
 
< 0.1%
35.65
 
< 0.1%
53.485
 
< 0.1%
34.75
 
< 0.1%
26.515
 
< 0.1%
36.655
 
< 0.1%
36.745
 
< 0.1%
Other values (5597)7654
65.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
available3984
25.3%
not3984
25.3%
067
 
0.4%
0.046
 
< 0.1%
36.745
 
< 0.1%
34.75
 
< 0.1%
35.65
 
< 0.1%
26.515
 
< 0.1%
53.485
 
< 0.1%
1.165
 
< 0.1%
Other values (5598)7659
48.7%

Most occurring characters

ValueCountFrequency (%)
a7968
 
8.9%
l7968
 
8.9%
.7633
 
8.5%
14064
 
4.5%
N3984
 
4.4%
o3984
 
4.4%
t3984
 
4.4%
3984
 
4.4%
A3984
 
4.4%
v3984
 
4.4%
Other values (12)38314
42.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter39840
44.3%
Decimal Number30426
33.9%
Uppercase Letter7968
 
8.9%
Other Punctuation7633
 
8.5%
Space Separator3984
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
14064
13.4%
33529
11.6%
23411
11.2%
43370
11.1%
53152
10.4%
62905
9.5%
72828
9.3%
82774
9.1%
92518
8.3%
01875
6.2%
Lowercase Letter
ValueCountFrequency (%)
a7968
20.0%
l7968
20.0%
o3984
10.0%
t3984
10.0%
v3984
10.0%
i3984
10.0%
b3984
10.0%
e3984
10.0%
Uppercase Letter
ValueCountFrequency (%)
N3984
50.0%
A3984
50.0%
Space Separator
ValueCountFrequency (%)
3984
100.0%
Other Punctuation
ValueCountFrequency (%)
.7633
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin47808
53.2%
Common42043
46.8%

Most frequent character per script

Common
ValueCountFrequency (%)
.7633
18.2%
14064
9.7%
3984
9.5%
33529
8.4%
23411
8.1%
43370
8.0%
53152
7.5%
62905
 
6.9%
72828
 
6.7%
82774
 
6.6%
Other values (2)4393
10.4%
Latin
ValueCountFrequency (%)
a7968
16.7%
l7968
16.7%
N3984
8.3%
o3984
8.3%
t3984
8.3%
A3984
8.3%
v3984
8.3%
i3984
8.3%
b3984
8.3%
e3984
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII89851
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a7968
 
8.9%
l7968
 
8.9%
.7633
 
8.5%
14064
 
4.5%
N3984
 
4.4%
o3984
 
4.4%
t3984
 
4.4%
3984
 
4.4%
A3984
 
4.4%
v3984
 
4.4%
Other values (12)38314
42.6%

Source EUI (kBtu/ft²)
Categorical

HIGH CARDINALITY

Distinct2920
Distinct (%)24.9%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
Not Available
 
163
107.7
 
22
118.4
 
20
115.5
 
19
114.6
 
19
Other values (2915)
11503 

Length

Max length13
Median length5
Mean length4.6689937
Min length1

Characters and Unicode

Total characters54842
Distinct characters22
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1060
Unique (%)9.0%

Sample

1st row619.4
2nd row404.3
3rd rowNot Available
4th rowNot Available
5th rowNot Available

Common Values

ValueCountFrequency (%)
Not Available163
 
1.4%
107.722
 
0.2%
118.420
 
0.2%
115.519
 
0.2%
114.619
 
0.2%
102.719
 
0.2%
10419
 
0.2%
121.319
 
0.2%
118.219
 
0.2%
120.719
 
0.2%
Other values (2910)11408
97.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not163
 
1.4%
available163
 
1.4%
107.722
 
0.2%
118.420
 
0.2%
120.719
 
0.2%
118.219
 
0.2%
10419
 
0.2%
121.319
 
0.2%
114.619
 
0.2%
102.719
 
0.2%
Other values (2911)11427
96.0%

Most occurring characters

ValueCountFrequency (%)
110970
20.0%
.10461
19.1%
24904
8.9%
33794
 
6.9%
43555
 
6.5%
93490
 
6.4%
83363
 
6.1%
53256
 
5.9%
63193
 
5.8%
73154
 
5.8%
Other values (12)4702
8.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number42262
77.1%
Other Punctuation10461
 
19.1%
Lowercase Letter1630
 
3.0%
Uppercase Letter326
 
0.6%
Space Separator163
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
110970
26.0%
24904
11.6%
33794
 
9.0%
43555
 
8.4%
93490
 
8.3%
83363
 
8.0%
53256
 
7.7%
63193
 
7.6%
73154
 
7.5%
02583
 
6.1%
Lowercase Letter
ValueCountFrequency (%)
a326
20.0%
l326
20.0%
o163
10.0%
t163
10.0%
v163
10.0%
i163
10.0%
b163
10.0%
e163
10.0%
Uppercase Letter
ValueCountFrequency (%)
N163
50.0%
A163
50.0%
Other Punctuation
ValueCountFrequency (%)
.10461
100.0%
Space Separator
ValueCountFrequency (%)
163
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common52886
96.4%
Latin1956
 
3.6%

Most frequent character per script

Common
ValueCountFrequency (%)
110970
20.7%
.10461
19.8%
24904
9.3%
33794
 
7.2%
43555
 
6.7%
93490
 
6.6%
83363
 
6.4%
53256
 
6.2%
63193
 
6.0%
73154
 
6.0%
Other values (2)2746
 
5.2%
Latin
ValueCountFrequency (%)
a326
16.7%
l326
16.7%
N163
8.3%
o163
8.3%
t163
8.3%
A163
8.3%
v163
8.3%
i163
8.3%
b163
8.3%
e163
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII54842
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
110970
20.0%
.10461
19.1%
24904
8.9%
33794
 
6.9%
43555
 
6.5%
93490
 
6.4%
83363
 
6.1%
53256
 
5.9%
63193
 
5.8%
73154
 
5.8%
Other values (12)4702
8.6%

Release Date
Categorical

HIGH CARDINALITY

Distinct3537
Distinct (%)30.1%
Missing0
Missing (%)0.0%
Memory size91.9 KiB
05/01/2017 02:58:14 PM
1258 
05/01/2017 07:43:09 PM
 
306
05/01/2017 01:01:42 PM
 
173
05/01/2017 10:35:35 PM
 
153
04/27/2017 11:31:08 AM
 
146
Other values (3532)
9710 

Length

Max length22
Median length22
Mean length22
Min length22

Characters and Unicode

Total characters258412
Distinct characters16
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2518
Unique (%)21.4%

Sample

1st row05/01/2017 05:32:03 PM
2nd row04/27/2017 11:23:27 AM
3rd row04/27/2017 11:23:27 AM
4th row04/27/2017 11:23:27 AM
5th row04/27/2017 11:23:27 AM
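Release Date is profiled as categorical text, but every value has the same 22-character shape. The sketch below parses one of the sample rows with the standard library, assuming a month/day/year order (a value such as 04/27/2017 rules out day-first) and a 12-hour clock with an AM/PM marker. The names `RELEASE_FORMAT` and `parse_release_date` are hypothetical.

```python
from datetime import datetime

# Assumed layout: MM/DD/YYYY hh:mm:ss AM|PM (12-hour clock)
RELEASE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_release_date(text):
    """Parse a Release Date string into a naive datetime (no time zone)."""
    return datetime.strptime(text, RELEASE_FORMAT)


# First sample row of this variable
parsed = parse_release_date("05/01/2017 05:32:03 PM")
print(parsed.isoformat())
```

The source does not state a time zone, so the result is left naive. Note that `%p` matching depends on the active locale; the sketch assumes an English or C locale.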

Common Values

ValueCountFrequency (%)
05/01/2017 02:58:14 PM1258
 
10.7%
05/01/2017 07:43:09 PM306
 
2.6%
05/01/2017 01:01:42 PM173
 
1.5%
05/01/2017 10:35:35 PM153
 
1.3%
04/27/2017 11:31:08 AM146
 
1.2%
04/24/2017 06:52:20 PM105
 
0.9%
05/01/2017 04:04:27 PM93
 
0.8%
04/24/2017 10:27:49 AM92
 
0.8%
05/01/2017 10:59:39 PM74
 
0.6%
04/28/2017 06:19:42 PM73
 
0.6%
Other values (3527)9273
78.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
pm9178
26.0%
05/01/20173266
 
9.3%
am2568
 
7.3%
04/28/20171406
 
4.0%
02:58:141258
 
3.6%
04/26/20171092
 
3.1%
04/27/2017923
 
2.6%
04/24/2017669
 
1.9%
04/25/2017621
 
1.8%
07:43:09306
 
0.9%
Other values (3516)13951
39.6%

Most occurring characters

ValueCountFrequency (%)
043672
16.9%
129477
11.4%
226821
10.4%
/23492
9.1%
23492
9.1%
:23492
9.1%
715797
 
6.1%
415556
 
6.0%
512129
 
4.7%
M11746
 
4.5%
Other values (6)32738
12.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number164444
63.6%
Other Punctuation46984
 
18.2%
Space Separator23492
 
9.1%
Uppercase Letter23492
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
043672
26.6%
129477
17.9%
226821
16.3%
715797
 
9.6%
415556
 
9.5%
512129
 
7.4%
37883
 
4.8%
85146
 
3.1%
64101
 
2.5%
93862
 
2.3%
Uppercase Letter
ValueCountFrequency (%)
M11746
50.0%
P9178
39.1%
A2568
 
10.9%
Other Punctuation
ValueCountFrequency (%)
/23492
50.0%
:23492
50.0%
Space Separator
ValueCountFrequency (%)
23492
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common234920
90.9%
Latin23492
 
9.1%

Most frequent character per script

Common
ValueCountFrequency (%)
043672
18.6%
129477
12.5%
226821
11.4%
/23492
10.0%
23492
10.0%
:23492
10.0%
715797
 
6.7%
415556
 
6.6%
512129
 
5.2%
37883
 
3.4%
Other values (3)13109
 
5.6%
Latin
ValueCountFrequency (%)
M11746
50.0%
P9178
39.1%
A2568
 
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII258412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
043672
16.9%
129477
11.4%
226821
10.4%
/23492
9.1%
23492
9.1%
:23492
9.1%
715797
 
6.1%
415556
 
6.0%
512129
 
4.7%
M11746
 
4.5%
Other values (6)32738
12.7%

Water Required?
Boolean

HIGH CORRELATION
MISSING

Distinct2
Distinct (%)< 0.1%
Missing118
Missing (%)1.0%
Memory size23.1 KiB
Value        Count    Frequency (%)
True         7552     64.3%
False        4076     34.7%
(Missing)    118      1.0%

DOF Benchmarking Submission Status
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing30
Missing (%)0.3%
Memory size91.9 KiB
In Compliance
11716 

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters152308
Distinct characters12
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowIn Compliance
2nd rowIn Compliance
3rd rowIn Compliance
4th rowIn Compliance
5th rowIn Compliance

Common Values

Value            Count    Frequency (%)
In Compliance    11716    99.7%
(Missing)        30       0.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
compliance11716
50.0%
in11716
50.0%

Most occurring characters

ValueCountFrequency (%)
n23432
15.4%
I11716
7.7%
11716
7.7%
C11716
7.7%
o11716
7.7%
m11716
7.7%
p11716
7.7%
l11716
7.7%
i11716
7.7%
a11716
7.7%
Other values (2)23432
15.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter117160
76.9%
Uppercase Letter23432
 
15.4%
Space Separator11716
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n23432
20.0%
o11716
10.0%
m11716
10.0%
p11716
10.0%
l11716
10.0%
i11716
10.0%
a11716
10.0%
c11716
10.0%
e11716
10.0%
Uppercase Letter
ValueCountFrequency (%)
I11716
50.0%
C11716
50.0%
Space Separator
ValueCountFrequency (%)
11716
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin140592
92.3%
Common11716
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n23432
16.7%
I11716
8.3%
C11716
8.3%
o11716
8.3%
m11716
8.3%
p11716
8.3%
l11716
8.3%
i11716
8.3%
a11716
8.3%
c11716
8.3%
Common
ValueCountFrequency (%)
11716
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII152308
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n23432
15.4%
I11716
7.7%
11716
7.7%
C11716
7.7%
o11716
7.7%
m11716
7.7%
p11716
7.7%
l11716
7.7%
i11716
7.7%
a11716
7.7%
Other values (2)23432
15.4%

Latitude
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct9179
Distinct (%)96.8%
Missing2263
Missing (%)19.3%
Infinite0
Infinite (%)0.0%
Mean40.75437906
Minimum40.516065
Maximum40.912869
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size91.9 KiB

Quantile statistics

Minimum40.516065
5-th percentile40.6082432
Q140.707226
median40.75913
Q340.8176235
95-th percentile40.8735034
Maximum40.912869
Range0.396804
Interquartile range (IQR)0.1103975

Descriptive statistics

Standard deviation0.08012027673
Coefficient of variation (CV)0.001965930498
Kurtosis-0.5904315159
Mean40.75437906
Median Absolute Deviation (MAD)0.054454
Skewness-0.353540957
Sum386473.7767
Variance0.006419258743
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
40.809752            26       0.2%
40.690247            8        0.1%
40.825416            7        0.1%
40.80709             5        < 0.1%
40.757859            5        < 0.1%
40.846181            5        < 0.1%
40.685662            5        < 0.1%
40.660488            4        < 0.1%
40.690205            4        < 0.1%
40.771277            4        < 0.1%
Other values (9169)  9410     80.1%
(Missing)            2263     19.3%

Value                Count    Frequency (%)
40.516065            1        < 0.1%
40.516725            1        < 0.1%
40.52136             1        < 0.1%
40.526742            1        < 0.1%
40.527654            1        < 0.1%
40.542105            1        < 0.1%
40.545344            1        < 0.1%
40.54843             1        < 0.1%
40.553672            1        < 0.1%
40.553993            1        < 0.1%

Value                Count    Frequency (%)
40.912869            1        < 0.1%
40.912828            1        < 0.1%
40.911797            1        < 0.1%
40.911588            1        < 0.1%
40.910181            1        < 0.1%
40.910026            1        < 0.1%
40.909804            1        < 0.1%
40.909178            1        < 0.1%
40.908837            1        < 0.1%
40.90873             1        < 0.1%

Longitude
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 9019
Distinct (%): 95.1%
Missing: 2263
Missing (%): 19.3%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.95705722
Minimum: -74.243582
Maximum: -73.715543
Zeros: 0
Zeros (%): 0.0%
Negative: 9483
Negative (%): 80.7%
Memory size: 91.9 KiB

Quantile statistics

Minimum: -74.243582
5-th percentile: -74.0093706
Q1: -73.984662
Median: -73.96281
Q3: -73.932443
95-th percentile: -73.8742071
Maximum: -73.715543
Range: 0.528039
Interquartile range (IQR): 0.052219

Descriptive statistics

Standard deviation: 0.04633735305
Coefficient of variation (CV): -0.000626544035
Kurtosis: 3.705120238
Mean: -73.95705722
Median Absolute Deviation (MAD): 0.024023
Skewness: -0.04260893678
Sum: -701334.7736
Variance: 0.002147150288
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
-73.960217           26       0.2%
-73.96433            8        0.1%
-73.940013           7        0.1%
-73.960847           5        < 0.1%
-73.715543           5        < 0.1%
-73.919642           5        < 0.1%
-73.970358           5        < 0.1%
-73.825737           4        < 0.1%
-73.983299           4        < 0.1%
-73.96194            4        < 0.1%
Other values (9009)  9410     80.1%
(Missing)            2263     19.3%

Value                Count    Frequency (%)
-74.243582           1        < 0.1%
-74.229887           1        < 0.1%
-74.224464           1        < 0.1%
-74.217143           1        < 0.1%
-74.215985           1        < 0.1%
-74.200825           2        < 0.1%
-74.19688            1        < 0.1%
-74.192836           1        < 0.1%
-74.192715           1        < 0.1%
-74.19242            1        < 0.1%

Value                Count    Frequency (%)
-73.715543           5        < 0.1%
-73.740543           1        < 0.1%
-73.74382            1        < 0.1%
-73.743888           1        < 0.1%
-73.745025           1        < 0.1%
-73.749057           1        < 0.1%
-73.751048           1        < 0.1%
-73.751151           1        < 0.1%
-73.75121            1        < 0.1%
-73.752276           1        < 0.1%

Community Board
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 19
Distinct (%): 0.2%
Missing: 2263
Missing (%): 19.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.140672783
Minimum: 1
Maximum: 56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 91.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 9
95-th percentile: 14
Maximum: 56
Range: 55
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.954128754
Coefficient of variation (CV): 0.5537473672
Kurtosis: 2.030256012
Mean: 7.140672783
Median Absolute Deviation (MAD): 2
Skewness: 0.6381740782
Sum: 67715
Variance: 15.6351342
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value             Count    Frequency (%)
5                 1313     11.2%
8                 1234     10.5%
7                 1052     9.0%
2                 742      6.3%
4                 725      6.2%
12                687      5.8%
1                 660      5.6%
6                 641      5.5%
9                 526      4.5%
14                377      3.2%
Other values (9)  1526     13.0%
(Missing)         2263     19.3%

Value             Count    Frequency (%)
1                 660      5.6%
2                 742      6.3%
3                 259      2.2%
4                 725      6.2%
5                 1313     11.2%
6                 641      5.5%
7                 1052     9.0%
8                 1234     10.5%
9                 526      4.5%
10                348      3.0%

Value             Count    Frequency (%)
56                1        < 0.1%
18                54       0.5%
17                109      0.9%
16                35       0.3%
15                220      1.9%
14                377      3.2%
13                135      1.1%
12                687      5.8%
11                365      3.1%
10                348      3.0%

Council District
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 44
Distinct (%): 0.5%
Missing: 2263
Missing (%): 19.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.77127491
Minimum: 1
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 91.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 9
Q3: 33
95-th percentile: 47
Maximum: 51
Range: 50
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 15.67437481
Coefficient of variation (CV): 0.9938559118
Kurtosis: -0.5839062863
Mean: 15.77127491
Median Absolute Deviation (MAD): 6
Skewness: 1.003232193
Sum: 149559
Variance: 245.6860257
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=44)
Value              Count    Frequency (%)
4                  1326     11.3%
3                  852      7.3%
6                  582      5.0%
1                  514      4.4%
2                  433      3.7%
5                  429      3.7%
11                 427      3.6%
7                  373      3.2%
10                 361      3.1%
14                 335      2.9%
Other values (34)  3851     32.8%
(Missing)          2263     19.3%

Value              Count    Frequency (%)
1                  514      4.4%
2                  433      3.7%
3                  852      7.3%
4                  1326     11.3%
5                  429      3.7%
6                  582      5.0%
7                  373      3.2%
8                  179      1.5%
9                  219      1.9%
10                 361      3.1%

Value              Count    Frequency (%)
51                 25       0.2%
50                 34       0.3%
49                 98       0.8%
48                 317      2.7%
47                 115      1.0%
46                 32       0.3%
45                 143      1.2%
44                 153      1.3%
43                 148      1.3%
42                 76       0.6%

Census Tract
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 807
Distinct (%): 8.5%
Missing: 2263
Missing (%): 19.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 4977.596647
Minimum: 1
Maximum: 155101
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 91.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 34.1
Q1: 100
Median: 201
Q3: 531.5
95-th percentile: 33201
Maximum: 155101
Range: 155100
Interquartile range (IQR): 431.5

Descriptive statistics

Standard deviation: 13520.42299
Coefficient of variation (CV): 2.716255243
Kurtosis: 22.59339667
Mean: 4977.596647
Median Absolute Deviation (MAD): 127
Skewness: 4.120780708
Sum: 47202549
Variance: 182801837.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
21                  67       0.6%
109                 66       0.6%
74                  64       0.5%
137                 64       0.5%
195                 64       0.5%
82                  64       0.5%
84                  62       0.5%
54                  62       0.5%
96                  61       0.5%
199                 60       0.5%
Other values (797)  8849     75.3%
(Missing)           2263     19.3%

Value               Count    Frequency (%)
1                   7        0.1%
2                   2        < 0.1%
3                   12       0.1%
6                   10       0.1%
7                   50       0.4%
8                   6        0.1%
9                   58       0.5%
11                  11       0.1%
12                  8        0.1%
13                  39       0.3%

Value               Count    Frequency (%)
155101              5        < 0.1%
117602              1        < 0.1%
114202              3        < 0.1%
107201              1        < 0.1%
105804              3        < 0.1%
103202              1        < 0.1%
101002              3        < 0.1%
101001              4        < 0.1%
99802               2        < 0.1%
99801               1        < 0.1%

NTA
Categorical

HIGH CARDINALITY
MISSING

Distinct: 144
Distinct (%): 1.5%
Missing: 2263
Missing (%): 19.3%
Memory size: 91.9 KiB

Midtown-Midtown South: 720
Upper East Side-Carnegie Hill: 458
Upper West Side: 439
Hudson Yards-Chelsea-Flatiron-Union Square: 416
Turtle Bay-East Midtown: 300
Other values (139): 7150

Length

Max length: 75
Median length: 75
Mean length: 75
Min length: 75

Characters and Unicode

Total characters: 711225
Distinct characters: 53
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: Turtle Bay-East Midtown
2nd row: Washington Heights South
3rd row: Washington Heights South
4th row: Washington Heights South
5th row: Washington Heights South

Common Values

Value                                       Count    Frequency (%)
Midtown-Midtown South                       720      6.1%
Upper East Side-Carnegie Hill               458      3.9%
Upper West Side                             439      3.7%
Hudson Yards-Chelsea-Flatiron-Union Square  416      3.5%
Turtle Bay-East Midtown                     300      2.6%
West Village                                253      2.2%
SoHo-TriBeCa-Civic Center-Little Italy      230      2.0%
Flatbush                                    227      1.9%
Lenox Hill-Roosevelt Island                 223      1.9%
Murray Hill-Kips Bay                        214      1.8%
Other values (134)                          6003     51.1%
(Missing)                                   2263     19.3%

Length

Histogram of lengths of the category
Value               Count    Frequency (%)
south               1190     5.1%
east                998      4.3%
west                972      4.2%
heights             944      4.1%
upper               897      3.9%
midtown-midtown     720      3.1%
hill                701      3.0%
square              633      2.7%
north               621      2.7%
village             556      2.4%
Other values (211)  14907    64.4%

Most occurring characters

Value              Count    Frequency (%)
(space)            523840   73.7%
e                  16361    2.3%
t                  14222    2.0%
o                  13290    1.9%
i                  13187    1.9%
a                  12105    1.7%
r                  11971    1.7%
n                  11732    1.6%
l                  9848     1.4%
s                  9182     1.3%
Other values (43)  75487    10.6%

Most occurring categories

Value              Count    Frequency (%)
Space Separator    523840   73.7%
Lowercase Letter   149754   21.1%
Uppercase Letter   30847    4.3%
Dash Punctuation   6715     0.9%
Other Punctuation  47       < 0.1%
Open Punctuation   11       < 0.1%
Close Punctuation  11       < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count    Frequency (%)
e                  16361    10.9%
t                  14222    9.5%
o                  13290    8.9%
i                  13187    8.8%
a                  12105    8.1%
r                  11971    8.0%
n                  11732    7.8%
l                  9848     6.6%
s                  9182     6.1%
d                  6598     4.4%
Other values (15)  31258    20.9%
Uppercase Letter
Value              Count    Frequency (%)
H                  4045     13.1%
S                  3732     12.1%
M                  3344     10.8%
C                  3149     10.2%
B                  2553     8.3%
W                  1691     5.5%
U                  1539     5.0%
E                  1370     4.4%
P                  1246     4.0%
F                  1069     3.5%
Other values (12)  7109     23.0%
Other Punctuation
Value              Count    Frequency (%)
.                  30       63.8%
'                  17       36.2%
Space Separator
Value              Count    Frequency (%)
(space)            523840   100.0%
Dash Punctuation
Value              Count    Frequency (%)
-                  6715     100.0%
Open Punctuation
Value              Count    Frequency (%)
(                  11       100.0%
Close Punctuation
Value              Count    Frequency (%)
)                  11       100.0%

Most occurring scripts

Value              Count    Frequency (%)
Common             530624   74.6%
Latin              180601   25.4%

Most frequent character per script

Latin
Value              Count    Frequency (%)
e                  16361    9.1%
t                  14222    7.9%
o                  13290    7.4%
i                  13187    7.3%
a                  12105    6.7%
r                  11971    6.6%
n                  11732    6.5%
l                  9848     5.5%
s                  9182     5.1%
d                  6598     3.7%
Other values (37)  62105    34.4%
Common
Value              Count    Frequency (%)
(space)            523840   98.7%
-                  6715     1.3%
.                  30       < 0.1%
'                  17       < 0.1%
(                  11       < 0.1%
)                  11       < 0.1%

Most occurring blocks

Value              Count    Frequency (%)
ASCII              711225   100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
(space)            523840   73.7%
e                  16361    2.3%
t                  14222    2.0%
o                  13290    1.9%
i                  13187    1.9%
a                  12105    1.7%
r                  11971    1.7%
n                  11732    1.6%
l                  9848     1.4%
s                  9182     1.3%
Other values (43)  75487    10.6%

Interactions

[144 interaction plots; images not reproduced here.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line, that is, its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
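
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code the report generator runs; the function name `pearson_r` and the sample data are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The covariance and both standard deviations carry the same 1/n factor,
    # which cancels in the ratio, so unnormalised sums are used here.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (norm_x * norm_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]          # hypothetical, roughly linear data
r = pearson_r(xs, ys)
# Shifting and positively rescaling a variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(r, r_rescaled)
```

Comparing `r` with `r_rescaled` illustrates the location and scale invariance mentioned above.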

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
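
As a rough sketch of that recipe, the code below ranks each variable and then applies the covariance-over-standard-deviations formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the data are illustrative assumptions, and giving tied values the mean of their positions is one common convention that is assumed here rather than taken from the report.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations (1/n cancels)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]       # monotonic but not linear
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this cubic example ρ reaches its maximum while r stays below it, which is the sense in which ρ catches nonlinear monotonic relationships.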

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
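
The pair-counting definition above translates directly into code. The sketch below implements exactly that formula, in which tied pairs count as neither concordant nor discordant; library implementations often use a tie-adjusted variant, so results can differ when the data contain ties. The function name `kendall_tau` and the data are illustrative.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # The sign of the product says whether X and Y move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]               # one swapped pair out of six
tau = kendall_tau(xs, ys)
print(tau)
```

Counting every pair is quadratic in the number of observations, which is acceptable for a small illustration but slow on large columns.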

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
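
A minimal sketch of the bias-corrected coefficient is shown below, assuming the commonly cited form of Bergsma's correction applied to a contingency table of counts. The function name `cramers_v_corrected` and the example tables are hypothetical; the sketch assumes every row and column total is positive and applies no continuity correction.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions accordingly.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[10, 10], [10, 10]]   # no association between rows and columns
perfect = [[20, 0], [0, 20]]         # each row determines the column
v_independent = cramers_v_corrected(independent)
v_perfect = cramers_v_corrected(perfect)
print(v_independent, v_perfect)
```

The two example tables hit the ends of the 0 to 1 range, which makes them convenient sanity checks for any implementation.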

Missing values

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
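
Nullity correlation can be illustrated by correlating presence indicators (1 where a value is present, 0 where it is missing) for two columns. The sketch below is an assumption about how such a figure can be computed, not the report's own code; the function name `nullity_correlation` and the three short columns are invented for the example, and both columns must contain at least one present and one missing value.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if value is None else 1.0 for value in col_a]
    b = [0.0 if value is None else 1.0 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)

# Hypothetical columns: the first two are missing on exactly the same rows.
latitude = [40.75, None, 40.84, None, 40.76]
longitude = [-73.97, None, -73.94, None, -73.96]
water_use = [None, 12.0, 30.5, 8.1, None]

same_rows = nullity_correlation(latitude, longitude)
mixed_rows = nullity_correlation(latitude, water_use)
print(same_rows, mixed_rows)
```

A value near 1 means two columns tend to be missing together, while a negative value means one tends to be present where the other is missing.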

Sample

First rows

OrderProperty IdProperty NameParent Property IdParent Property NameBBL - 10 digitsNYC Borough, Block and Lot (BBL) self-reportedNYC Building Identification Number (BIN)Address 1 (self-reported)Address 2Postal CodeStreet NumberStreet NameBoroughDOF Gross Floor AreaPrimary Property Type - Self SelectedList of All Property Use Types at PropertyLargest Property Use TypeLargest Property Use Type - Gross Floor Area (ft²)2nd Largest Property Use Type2nd Largest Property Use - Gross Floor Area (ft²)3rd Largest Property Use Type3rd Largest Property Use Type - Gross Floor Area (ft²)Year BuiltNumber of Buildings - Self-reportedOccupancyMetered Areas (Energy)Metered Areas (Water)ENERGY STAR ScoreSite EUI (kBtu/ft²)Weather Normalized Site EUI (kBtu/ft²)Weather Normalized Site Electricity Intensity (kWh/ft²)Weather Normalized Site Natural Gas Intensity (therms/ft²)Weather Normalized Source EUI (kBtu/ft²)Fuel Oil #1 Use (kBtu)Fuel Oil #2 Use (kBtu)Fuel Oil #4 Use (kBtu)Fuel Oil #5 & 6 Use (kBtu)Diesel #2 Use (kBtu)District Steam Use (kBtu)Natural Gas Use (kBtu)Weather Normalized Site Natural Gas Use (therms)Electricity Use - Grid Purchase (kBtu)Weather Normalized Site Electricity (kWh)Total GHG Emissions (Metric Tons CO2e)Direct GHG Emissions (Metric Tons CO2e)Indirect GHG Emissions (Metric Tons CO2e)Property GFA - Self-Reported (ft²)Water Use (All Water Sources) (kgal)Water Intensity (All Water Sources) (gal/ft²)Source EUI (kBtu/ft²)Release DateWater Required?DOF Benchmarking Submission StatusLatitudeLongitudeCommunity BoardCouncil DistrictCensus TractNTA
0113286201/20513286201/205101316000110131600011037549201/205 East 42nd st.Not Available100176753 AVENUEManhattan289356.0OfficeOfficeOffice293447Not AvailableNot AvailableNot AvailableNot Available19632100Whole BuildingNot AvailableNot Available305.6303.137.8Not Available614.2Not AvailableNot AvailableNot AvailableNot AvailableNot Available5.15506751E7Not AvailableNot Available38139374.21.10827705E76962.206962.2762051Not AvailableNot Available619.405/01/2017 05:32:03 PMNoIn Compliance40.750791-73.9739636.04.088.0Turtle Bay-East Midtown
1228400NYP Columbia (West Campus)28400NYP Columbia (West Campus)10213800401-02138-00401084198; 1084387;1084385; 1084386; 1084388; 1084389; 1807867; 1809824622 168th StreetNot Available10032180FT WASHINGTON AVENUEManhattan3693539.0Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)3889181Not AvailableNot AvailableNot AvailableNot Available196912100Whole BuildingWhole Building55229.8228.824.82.4401.1Not Available1.96248472E7Not AvailableNot AvailableNot Available-3.914148026E89330734419330734.43323659249.62613121E755870.451016.44854.13889181Not AvailableNot Available404.304/27/2017 11:23:27 AMNoIn Compliance40.841402-73.94256812.010.0251.0Washington Heights South
234778226MSCHoNY North28400NYP Columbia (West Campus)10213800301-02138-003010633803975 BroadwayNot Available100323975BROADWAYManhattan152765.0Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)231342Not AvailableNot AvailableNot AvailableNot Available19241100Not AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot Available000231342Not AvailableNot AvailableNot Available04/27/2017 11:23:27 AMNoIn Compliance40.840427-73.94024912.010.0251.0Washington Heights South
344778267Herbert Irving Pavilion & Millstein Hospital28400NYP Columbia (West Campus)10213900011-02139-00011087281; 1076746161 Fort Washington Ave177 Fort Washington Ave10032161FT WASHINGTON AVENUEManhattan891040.0Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)1305748Not AvailableNot AvailableNot AvailableNot Available19711100Not AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot Available0001305748Not AvailableNot AvailableNot Available04/27/2017 11:23:27 AMNoIn Compliance40.840746-73.94285412.010.0255.0Washington Heights South
454778288Neuro Institute28400NYP Columbia (West Campus)10213900851-02139-00851063403710 West 168th StreetNot Available10032193FT WASHINGTON AVENUEManhattan211400.0Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)Hospital (General Medical & Surgical)179694Not AvailableNot AvailableNot AvailableNot Available19321100Not AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot AvailableNot Available000179694Not AvailableNot AvailableNot Available04/27/2017 11:23:27 AMNoIn Compliance40.841559-73.94252812.010.0255.0Washington Heights South
5 | 6 | 28402 | NYP Cornell (East Campus) | 28402 | NYP Cornell (East Campus) | 1014800001 | 1-01480-0001 | 1084781; 1084780 | 525 East 68th Street | Not Available | 10021 | 1176 | YORK AVENUE | Manhattan | 2230742.0 | Hospital (General Medical & Surgical) | Hospital (General Medical & Surgical) | Hospital (General Medical & Surgical) | 2971874 | Not Available | Not Available | Not Available | Not Available | 1932 | 12 | 100 | Whole Building | Whole Building | 55 | 359.9 | 359 | 8.3 | 4.8 | 411.5 | Not Available | 2.00832154E7 | Not Available | Not Available | Not Available | -4.690796909E8 | 1.4322508769E9 | 1.43225088E7 | 86335350.5 | 2.45508594E7 | 54429.8 | 77564.1 | -23134.3 | 2971874 | Not Available | Not Available | 414.2 | 04/27/2017 11:23:27 AM | No | In Compliance | 40.761395 | -73.957726 | 8.0 | 5.0 | 116.0 | Lenox Hill-Roosevelt Island
6 | 7 | 4778352 | Annex Building & Garage | 28402 | NYP Cornell (East Campus) | 1014820040 | 1-01482-0040 | 1081252 | 523 East 70th St | 515 East 70th St | 10021 | 512 | EAST 71 STREET | Manhattan | 245000.0 | Mixed Use Property | Other | Other | 245000 | Not Available | Not Available | Not Available | Not Available | 1932 | 1 | 60 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 0 | 0 | 0 | 245000 | Not Available | Not Available | Not Available | 04/27/2017 11:23:27 AM | Yes | In Compliance | 40.765949 | -73.953752 | 8.0 | 5.0 | 124.0 | Lenox Hill-Roosevelt Island
7 | 10 | 2610789 | North Shore Towers | 2610789 | North Shore Towers | 4084890001 | 4084890001 | 4456886;4456885;4453535;4456888 | 270-10 Grand Central Parkway | 269-271-10 Grand Central Parkway | 11005 | 269 | GRAND CENTRAL PKWY | Queens | 3750565.0 | Multifamily Housing | Financial Office, Medical Office, Multifamily Housing, Non-Refrigerated Warehouse, Office, Other, Other - Entertainment/Public Assembly, Other - Public Services, Other - Recreation, Other - Services, Parking, Restaurant, Retail Store, Social/Meeting Hall, Supermarket/Grocery Store, Swimming Pool | Multifamily Housing | 2400000 | Parking | 900000 | Other | 230200 | 1974 | 4 | 100 | Whole Building | Combination of common and tenant areas | Not Available | 143974.4 | 143976 | Not Available | 1439.7 | 151174.5 | Not Available | 1.37367028E7 | Not Available | Not Available | Not Available | Not Available | 394285242148 | 3.9428524215E9 | Not Available | Not Available | 20943400 | 20943400 | 0 | 2738875 | 107151.5 | 39.13 | 151172.9 | 04/28/2017 07:44:37 AM | Yes | In Compliance | 40.757859 | -73.715543 | 13.0 | 23.0 | 155101.0 | Glen Oaks-Floral Park-New Hyde Park
8 | 11 | 2611745 | Towers Golf Course and Irrigation Wells | 2610789 | North Shore Towers | 4084890001 | 4084890001 | 4456888 | 272-86 Grand Central Parkway | Not Available | 11005 | 269 | GRAND CENTRAL PKWY | Queens | 3750565.0 | Other | Other | Other | 200 | Not Available | Not Available | Not Available | Not Available | 1974 | 1 | 100 | Whole Building | Not Available | Not Available | 1138.3 | 1091.5 | 319.9 | Not Available | 3427.3 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 227658.1 | 63979.9 | 21.1 | 0 | 21.1 | 200 | 19261.1 | 96305.69 | 3574.2 | 04/28/2017 07:44:37 AM | Yes | In Compliance | 40.757859 | -73.715543 | 13.0 | 23.0 | 155101.0 | Glen Oaks-Floral Park-New Hyde Park
9 | 12 | 3616379 | North Shore Towers Bld 1 | 2610789 | North Shore Towers | 4084890001 | 4084890001 | 4456886 | 271-10 Grand Central Parkway | Not Available | 11005 | 269 | GRAND CENTRAL PKWY | Queens | 3750565.0 | Multifamily Housing | Other | Other | 2738875 | Not Available | Not Available | Not Available | Not Available | 1974 | 1 | 100 | Whole Building | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 0 | 0 | 0 | 912892 | Not Available | Not Available | Not Available | 04/28/2017 07:44:37 AM | Yes | In Compliance | 40.757859 | -73.715543 | 13.0 | 23.0 | 155101.0 | Glen Oaks-Floral Park-New Hyde Park

Last rows

(index) | Order | Property Id | Property Name | Parent Property Id | Parent Property Name | BBL - 10 digits | NYC Borough, Block and Lot (BBL) self-reported | NYC Building Identification Number (BIN) | Address 1 (self-reported) | Address 2 | Postal Code | Street Number | Street Name | Borough | DOF Gross Floor Area | Primary Property Type - Self Selected | List of All Property Use Types at Property | Largest Property Use Type | Largest Property Use Type - Gross Floor Area (ft²) | 2nd Largest Property Use Type | 2nd Largest Property Use - Gross Floor Area (ft²) | 3rd Largest Property Use Type | 3rd Largest Property Use Type - Gross Floor Area (ft²) | Year Built | Number of Buildings - Self-reported | Occupancy | Metered Areas (Energy) | Metered Areas (Water) | ENERGY STAR Score | Site EUI (kBtu/ft²) | Weather Normalized Site EUI (kBtu/ft²) | Weather Normalized Site Electricity Intensity (kWh/ft²) | Weather Normalized Site Natural Gas Intensity (therms/ft²) | Weather Normalized Source EUI (kBtu/ft²) | Fuel Oil #1 Use (kBtu) | Fuel Oil #2 Use (kBtu) | Fuel Oil #4 Use (kBtu) | Fuel Oil #5 & 6 Use (kBtu) | Diesel #2 Use (kBtu) | District Steam Use (kBtu) | Natural Gas Use (kBtu) | Weather Normalized Site Natural Gas Use (therms) | Electricity Use - Grid Purchase (kBtu) | Weather Normalized Site Electricity (kWh) | Total GHG Emissions (Metric Tons CO2e) | Direct GHG Emissions (Metric Tons CO2e) | Indirect GHG Emissions (Metric Tons CO2e) | Property GFA - Self-Reported (ft²) | Water Use (All Water Sources) (kgal) | Water Intensity (All Water Sources) (gal/ft²) | Source EUI (kBtu/ft²) | Release Date | Water Required? | DOF Benchmarking Submission Status | Latitude | Longitude | Community Board | Council District | Census Tract | NTA
11736 | 14983 | 4950741 | ROSENBERG: 1955 Grand Concourse | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 2028080051 | 2-02808-0051 | 2007688 | 1955 Grand Concourse | Not Available | 10461 | 1955 | GRAND CONCOURSE | Bronx | 59800.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 67350 | Not Available | Not Available | Not Available | Not Available | 1930 | 1 | 100 | Whole Building | Not Available | 83 | 102 | 106.5 | 4.6 | 0.9 | 144.2 | Not Available | 277173 | Not Available | Not Available | Not Available | Not Available | 5511800.5 | 58436.7 | 1081157 | 307746.1 | 413.7 | 313.3 | 100.3 | 67350 | 8862 | 131.58 | 140.5 | 04/26/2017 04:59:11 PM | Yes | NaN | 40.850655 | -73.905048 | 5.0 | 14.0 | 23501.0 | Mount Hope
11737 | 14984 | 4950728 | ROSENBERG: 1480 Popham Ave | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 2028770211 | 2-02877-0211 | 2008874 | 1480 Popham Ave | Not Available | 10461 | 1480 | POPHAM AVENUE | Bronx | 60480.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 72372 | Not Available | Not Available | Not Available | Not Available | 1940 | 1 | 100 | Whole Building | Not Available | 83 | 95 | 101.4 | 5 | 0.8 | 142.2 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 5593400.1 | 60973.7 | 1284867.1 | 363247.5 | 416.3 | 297.1 | 119.2 | 72372 | 7215.4 | 99.7 | 136.9 | 04/26/2017 04:59:11 PM | No | NaN | 40.847884 | -73.921891 | 5.0 | 14.0 | 20501.0 | University Heights-Morris Heights
11738 | 14985 | 4408791 | Milton Gordon: 679 WEST 239 STREET | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 2059200510 | 2-05920-0510 | 2085873 | 679 WEST 239 STREET | Not Available | 10463 | 679 | WEST 239 STREET | Bronx | 125526.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 131802 | Not Available | Not Available | Not Available | Not Available | 1960 | 1 | 100 | Whole Building | Not Available | 100 | 12.2 | 11.8 | 2.8 | 0 | 32.5 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 291600 | 2973.4 | 1315977.7 | 370162 | 137.6 | 15.5 | 122.1 | 131802 | 2989.6 | 22.68 | 33.7 | 05/01/2017 01:22:10 PM | No | NaN | 40.889859 | -73.914254 | 8.0 | 11.0 | 309.0 | North Riverdale-Fieldston-Riverdale
11739 | 14986 | 4408781 | Milton Gordon: 699 WEST 239 STREET | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 2059200687 | 2-05920-0687 | 2085876 | 699 WEST 239 STREET | Not Available | 10463 | 699 | WEST 239 STREET | Bronx | 162000.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 170100 | Not Available | Not Available | Not Available | Not Available | 1962 | 1 | 100 | Whole Building | Whole Building | 99 | 75 | 82.5 | Not Available | 0.8 | 86.6 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 1.27625995E7 | 140257.4 | Not Available | Not Available | 677.9 | 677.9 | 0 | 170100 | Not Available | Not Available | 78.8 | 04/26/2017 04:32:19 PM | NaN | NaN | 40.889914 | -73.914963 | 8.0 | 11.0 | 309.0 | North Riverdale-Fieldston-Riverdale
11740 | 14987 | 4940405 | Advanced: 161 Henry Street | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 3002370008 | 3-00237-0008 | 3001882 | 161 Henry Street | Not Available | 11201 | 161 | HENRY STREET | Brooklyn | 51110.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 53665 | Not Available | Not Available | Not Available | Not Available | 1906 | 1 | 100 | Whole Building | Whole Building | 86 | 75 | 79.9 | 4.1 | 0.7 | 112.9 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 3260900.1 | 35428.7 | 765468.5 | 218353.1 | 244.2 | 173.2 | 71 | 53665 | 5122.8 | 95.46 | 108.6 | 04/26/2017 11:20:06 AM | NaN | NaN | 40.695759 | -73.993826 | 2.0 | 33.0 | 502.0 | Brooklyn Heights-Cobble Hill
11741 | 14988 | 4940453 | Advanced: 24 Monroe Place | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 3002380026 | 3-00238-0026 | 3001927 | 24 Monroe Place | Not Available | 11218 | 22 | MONROE PLACE | Brooklyn | 70645.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 74177 | Not Available | Not Available | Not Available | Not Available | 1928 | 1 | 100 | Whole Building | Whole Building | 98 | 81.4 | 87 | 3.3 | 0.8 | 114.9 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 5179599.9 | 56168.9 | 857323 | 245251.1 | 354.7 | 275.1 | 79.6 | 74177 | 2308.5 | 31.12 | 109.6 | 04/26/2017 11:20:06 AM | NaN | NaN | 40.696420 | -73.992495 | 2.0 | 33.0 | 502.0 | Brooklyn Heights-Cobble Hill
11742 | 14989 | 4940416 | Advanced: 150 Joralemon St / 124 Clinton St | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 3002640017 | 3-00264-0017 | 3002539 | 150 Joralemon Street | Not Available | 11201 | 130 | CLINTON STREET | Brooklyn | 93500.0 | Multifamily Housing | Multifamily Housing, Other | Multifamily Housing | 93500 | Other | 7791 | Not Available | Not Available | 1926 | 1 | 100 | Whole Building | Not Available | 16 | 109.3 | 117.7 | 5.3 | 0.9 | 160.8 | Not Available | 565800 | Not Available | Not Available | Not Available | Not Available | 8616799.8 | 95350 | 1885508.7 | 532672.7 | 674.6 | 499.7 | 175 | 101291 | 4216.8 | 41.63 | 153.4 | 04/26/2017 11:20:06 AM | NaN | NaN | 40.692602 | -73.993231 | 2.0 | 33.0 | 7.0 | Brooklyn Heights-Cobble Hill
11743 | 14990 | 4628296 | (9267) - 267 Sixth St | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 3009870001 | 3-00987-0001 | 3413788; 3021326 | 267 6th Street | Not Available | 11215 | NaN | NaN | NaN | NaN | Multifamily Housing | Multifamily Housing | Multifamily Housing | 103328 | Not Available | Not Available | Not Available | Not Available | 1913 | 1 | 95 | Whole Building | Not Available | 83 | 44.3 | 43.9 | 7.7 | 0.2 | 101 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 1761629.8 | 18131 | 2818854.7 | 796837.2 | 355.2 | 93.6 | 261.6 | 103328 | Not Available | Not Available | 103.6 | 03/23/2017 02:51:02 PM | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
11744 | 14991 | 4940464 | Advanced: 27 Prospect Park West | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 3010720040 | 3-01072-0040 | 3024968;3824680 | 27 Prospect Park West | Not Available | 11215 | 27 | PROSPECT PARK WEST | Brooklyn | 57824.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 60715 | Not Available | Not Available | Not Available | Not Available | 1928 | 1 | 100 | Whole Building | Whole Building | 92 | 70 | 74.9 | 2.6 | 0.7 | 96.9 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 3693499.8 | 40162.6 | 556759.9 | 155319.3 | 247.8 | 196.2 | 51.7 | 60715 | 1687.6 | 27.8 | 92.7 | 04/26/2017 11:20:06 AM | NaN | NaN | 40.670728 | -73.971752 | 6.0 | 39.0 | 165.0 | Park Slope-Gowanus
11745 | 14993 | 4952165 | Tryad: 420 Clinton Ave | Not Applicable: Standalone Property | Not Applicable: Standalone Property | 3019600022 | 3-01960-0022 | 3055969 | 420 Clinton Ave | Not Available | 11238 | 419 | VANDERBILT AVENUE | Brooklyn | 60720.0 | Multifamily Housing | Multifamily Housing | Multifamily Housing | 63756 | Not Available | Not Available | Not Available | Not Available | 1930 | 1 | 100 | Whole Building | Not Available | 19 | 130.2 | 140.5 | 3.4 | 1.3 | 172 | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | 7528499.8 | 82151.4 | 770105.4 | 218287.2 | 471.3 | 399.9 | 71.5 | 63756 | Not Available | Not Available | 161.9 | 04/27/2017 06:37:53 AM | Yes | NaN | 40.685549 | -73.968310 | 2.0 | 35.0 | 199.0 | Clinton Hill